Learning to Tag

Collin Brooke
5 min read · Sep 7, 2021
A department store window display of several headless mannequins wearing clothes that are covered in neon yellow price tags that look like post-it notes
“Price tags” by nicolasnova is licensed under CC BY 2.0

This semester, I’m teaching our program’s “601,” which bills itself as an introduction to the field. Most of our students already have a couple of years’ worth of writing studies under their belts, so I don’t think of the course as an exercise in content. Instead, I introduce the course with Gilbert Ryle’s distinction between know-that and know-how and explain that we’ll be focusing on the latter.

Without putting too fine a point on it, my feeling is that most graduate programs could do with a healthier ratio between descriptive/propositional and procedural knowledge. Our students graduate having read thousands upon thousands of pages of scholarship, but often, when it comes to learning how to produce their own, we leave them to their own devices. I can’t tell you how many times I’ve talked with students about the differences among seminar papers, journal articles, edited collections, and monographs, for example. Or the different types of reading that you must practice in graduate school (and later). Or building a sustainable process of note taking.

Anyhow, this week, we’ll be talking about Zotero and I’ll strongly encourage them to make use of it not only this semester but throughout their careers. As I was thinking about what I might be able to tell them that would be helpful, I reflected on the various discussions I’ve had over the years about tagging, that is, assigning keywords and/or metadata to the entries in a database. As I was digging back through the things I’ve written about the topic, I came across this little gem from my blog, which doesn’t name Ryle for all that it uses his distinction:

…metadata contains, at its heart, a ratio between description and procedurality. That is, there is no description degree zero, no purely descriptive metadata, and I think that sometimes we fall into the trap of imagining that there is. It’s not enough to simply say “it’s both,” though–I think that claim is easy enough to accept. The procedural literacy of metadata lies perhaps in figuring out that variable ratio and tuning it to the task at hand. (http://www.cgbrooke.net/2014/02/18/metadata-procedurality-and-works-slighted/)

I think that, when we set about developing metadata, there’s a temptation to focus mostly on description, to mirror the data being referenced. There’s one school of thought that holds that a good system of metadata would enable different folks to arrive at the same set of tags for a particular article, for example. And there’s some truth behind that, honestly, if we understand the purpose and context for those tags in a prescriptive sense. That is, if we imagine a generic user searching a database for an essay on a particular topic, then a consistent, standardized set of tags makes a great deal of sense.

But as I encourage my students to build their own databases, it’s not about imagining that generic user. Instead, they’re their own audience, and I want them to consider the various tasks that a rich, evolving database of scholarship is going to help them with. So, I want them to think about procedural tagging, that is, tags that don’t necessarily have anything to do with the objects themselves, but instead reference use and context. One obvious example: as they enter works from the courses they’re taking, they should tag them with the course number. So all of the things they read from my course should be tagged with 601. Being able to go back into a database and instantly retrieve all the readings from a particular course? I can imagine several ways that this would be useful.
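For anyone who thinks better in code, here is a minimal sketch of what that kind of retrieval looks like. It is not Zotero’s interface or API; the `Entry` class, the `with_tag` helper, and the article titles are all hypothetical, made up only to show descriptive tags and procedural tags (a course number, a chapter) sitting side by side on the same entry.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One item in a personal database of sources (hypothetical structure)."""
    title: str
    tags: set = field(default_factory=set)


def with_tag(library, tag):
    """Return every entry that carries the given tag, in library order."""
    return [entry for entry in library if tag in entry.tags]


# Placeholder entries: "601" and "chapter-3" are procedural tags,
# the rest are descriptive ones.
library = [
    Entry("Hypothetical article A", {"rhetoric", "601"}),
    Entry("Hypothetical article B", {"pedagogy", "601", "chapter-3"}),
    Entry("Hypothetical article C", {"historiography"}),
]

# Pull everything read for the course in one step.
course_readings = with_tag(library, "601")
print([entry.title for entry in course_readings])
```

The same lookup works for any procedural tag, which is the point: the tag records what the source is for, not what it is about.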

When I was working on my book, and speed-reading sources, I used to tag things with the specific chapter where I thought that something would fit. I do something similar with the evolving cloud of possible projects that I’m thinking about at a given moment — some of them coalesce into actual projects while others dissipate, but this kind of procedural tagging helps my ideas percolate.

The other thing that I want them to think about when it comes to tagging is scale. This is a lesson that emerged for me when I was developing the online archive for my field’s flagship journal (RIP, cocoa). We combined top-down categories with emergent tagging that relied on early digital humanities tools to process texts and extract word and phrase frequencies.

One of the things we discovered was that this process led to clusters of terms at various scales. In a corpus of roughly four hundred articles, some terms were so frequent (writing, rhetoric, pedagogy, literacy, et al.) that they became meaningless because they would yield too large a percentage of the database. And then there were those that were so infrequent that they proved equally unhelpful: there was exactly one article that referenced fortune cookies, for instance. But there was a cluster of terms that hit that sweet spot for us. Interestingly enough, it was right around the square root (20) of our total number of articles. The tags that appeared 15–25 times in our database proved remarkably helpful, and they tended to correspond to areas of the field or specific methodological approaches. If you wanted to know which articles from the journal discussed writing across the curriculum or historiography, our database was a good place to look.

The metaphor that I got into the habit of using was snow. Those near-universal terms generated an avalanche of results, while one-hit metadata felt like snowflakes (this was before the word acquired its contemporary, politicized connotations, of course). But those terms that were especially useful? Those were snowballs — it’s much easier to hit a target with a snowball than with the alternatives. And a snowball is crafted, compressed for that purpose. It’s a “tool” in a way that avalanches and snowflakes are not.
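The square-root observation is easy to try on your own database. What follows is a rough sketch, not the process we used on the archive. The `classify_tags` function is hypothetical, and the `tolerance` of 25% is my assumption, chosen only because it turns a target of 20 into the 15–25 band described above. It also simplifies by calling everything above the band an avalanche and everything below it a snowflake, which is blunter than the real picture.

```python
import math
from collections import Counter


def classify_tags(tag_lists, tolerance=0.25):
    """Sort tags into snow categories by how many entries carry them.

    tag_lists holds one list of tags per entry. The sweet spot is centered
    on the square root of the number of entries; tolerance (an assumed
    value) sets how wide that band is on either side.
    """
    target = math.sqrt(len(tag_lists))
    low = target * (1 - tolerance)
    high = target * (1 + tolerance)

    # Count each tag once per entry, even if an entry repeats it.
    counts = Counter(tag for tags in tag_lists for tag in set(tags))

    labels = {}
    for tag, count in counts.items():
        if count > high:
            labels[tag] = "avalanche"   # returns too much of the database
        elif count < low:
            labels[tag] = "snowflake"   # returns too little to be useful
        else:
            labels[tag] = "snowball"    # the sweet spot
    return labels


# A made-up corpus of 400 entries; the counts are illustrative, not real data.
entries = [[] for _ in range(400)]
for i in range(300):
    entries[i].append("writing")
for i in range(20):
    entries[i].append("historiography")
entries[0].append("fortune cookies")

print(classify_tags(entries))
```

For a personal database the band will move as the collection grows, so it is worth rerunning something like this every so often rather than treating the cutoffs as fixed.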

And in some ways, I suppose that brings me back to procedurality, to using even primarily descriptive metadata in a way that carries with it some intent and purpose. I want them to imagine the uses to which they’ll be putting their metadata and to shape it as they go to make it as productive as possible.

Collin Brooke

digital rhetorics professor at Syracuse University. rarely accused of underthinking it.