The clustering algorithm has to make a choice when it encounters a word like “cried.” The word might in practice have different sets of associations, based on different uses (weep/exclaim), but it’s got to go in one branch or another. It can’t occupy multiple locations in the tree...
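The constraint described above can be seen in a minimal sketch of agglomerative clustering. Everything in it is an assumption made for illustration: the co-occurrence vectors are invented, and average linkage with Euclidean distance is only one of several possible choices, not necessarily the one used in the excerpt. Whatever the tree looks like, each word ends up as exactly one leaf.

```python
from itertools import combinations

# Hypothetical toy co-occurrence counts, invented for illustration.
# Dimensions: [tears, sorrow, shout, speech]
VECTORS = {
    "wept":      [5.0, 4.0, 0.0, 0.0],
    "sobbed":    [4.0, 5.0, 0.0, 1.0],
    "exclaimed": [0.0, 0.0, 5.0, 4.0],
    "shouted":   [0.0, 1.0, 4.0, 5.0],
    "cried":     [3.0, 2.0, 2.0, 3.0],  # mixes the weep and exclaim senses
}


def distance(a, b):
    """Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def cluster_distance(c1, c2):
    """Average linkage: mean distance over all cross-cluster word pairs."""
    pairs = [(a, b) for a in c1 for b in c2]
    return sum(distance(VECTORS[a], VECTORS[b]) for a, b in pairs) / len(pairs)


def agglomerate(words):
    """Merge the two closest clusters until a single binary tree remains."""
    # Each cluster is (member words, nested-tuple tree).
    clusters = [((w,), w) for w in words]
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(clusters[ij[0]][0], clusters[ij[1]][0]),
        )
        merged = (clusters[i][0] + clusters[j][0], (clusters[i][1], clusters[j][1]))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][1]


def leaves(tree):
    """Flatten the nested-tuple tree into its list of leaf words."""
    if isinstance(tree, str):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])


tree = agglomerate(sorted(VECTORS))
leaf_list = leaves(tree)
print(tree)
print("'cried' appears", leaf_list.count("cried"), "time(s) in the tree")
```

Representing both senses would take a different kind of model, for instance one that allows soft or overlapping membership, rather than a tuning of this one.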
In addition to the MD5 algorithm, the md5deep suite supports alternative algorithms through additional utilities such as sha1deep, tigerdeep, sha256deep, and whirlpooldeep, all of which are included in the md5deep suite download. Figure 5.4. ...
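As a rough illustration of what such utilities do, the sketch below walks a directory tree and hashes every file with a selectable algorithm, using Python's standard `hashlib`. It is not md5deep itself and does not reproduce its options or output format; the function name `hash_tree` and its parameters are hypothetical. Tiger and Whirlpool are left out because their availability in `hashlib` depends on the local build.

```python
import hashlib
from pathlib import Path


def hash_tree(root, algorithm="md5", chunk_size=65536):
    """Return {relative path: hex digest} for every file under root.

    `algorithm` is any name accepted by hashlib.new(), such as
    "md5", "sha1", or "sha256".
    """
    results = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            digest = hashlib.new(algorithm)
            with path.open("rb") as handle:
                # Read in chunks so large files are not loaded into memory at once.
                for chunk in iter(lambda: handle.read(chunk_size), b""):
                    digest.update(chunk)
            results[str(path.relative_to(root))] = digest.hexdigest()
    return results


if __name__ == "__main__":
    # Hash the current directory with two different algorithms.
    for name in ("md5", "sha256"):
        for rel_path, hex_digest in hash_tree(".", algorithm=name).items():
            print(name, hex_digest, rel_path)
```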
sci2: The Science of Science (Sci2) Tool is a modular toolset specifically designed for the study of science. It supports the temporal, geospatial, topical, and network analysis and visualization of scholarly datase...
I’m going to press this point in detail, because it’s not just a metaphor: to produce a simple but useful topic-modeling algorithm, all you have to do is take a search engine and run it backwards. 2) The second argument is newer; I don’t think I’ve blogged about it yet. I...
but it does suggest that linguistic signals of “beginnings,” “middles,” and “ends” remained broadly similar from the early nineteenth century through the early twentieth. If we wanted to confirm that, we could make more direct comparisons, but for exploratory visualization I see how PCA is...
Here you start with an unlabeled collection of texts; you ask a learning algorithm to organize the collection by finding clusters or patterns of some loosely specified kind. You don’t necessarily know what patterns will emerge. If this sounds epistemologically risky, you’re not wrong. Since ...
Using an algorithm outlined by Jonathan Foote (2000), the authors translate this grid into a line plot where the dips represent musical “revolutions.” Trying the same thing on the history of the novel. Could we do the same thing for the history of fiction? The labor-intensive part would...
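The general idea behind Foote-style novelty detection can be sketched as follows: slide a checkerboard kernel along the diagonal of a square comparison grid and sum the weighted entries at each position. This is a minimal sketch under stated assumptions, not the authors' implementation. The grid here holds similarities, the kernel is unweighted (no tapering toward its edges), and the data and the name `foote_novelty` are invented for illustration. With a similarity grid, boundaries show up as peaks; with a distance grid the sign roughly flips, so boundaries appear as dips.

```python
def foote_novelty(sim, half_width):
    """Score each diagonal position of a square similarity grid.

    The kernel adds similarities within the "before" block and within the
    "after" block, and subtracts similarities that cross between them.
    Cells that fall outside the grid are skipped, so scores near the
    edges are based on a truncated kernel.
    """
    n = len(sim)
    scores = []
    for center in range(n):
        total = 0.0
        for di in range(-half_width, half_width):
            for dj in range(-half_width, half_width):
                i, j = center + di, center + dj
                if 0 <= i < n and 0 <= j < n:
                    same_side = (di < 0) == (dj < 0)
                    total += sim[i][j] if same_side else -sim[i][j]
        scores.append(total)
    return scores


# Hypothetical toy grid: eight time slices, with an abrupt change after the
# fourth. Slices in the same block have similarity 1, otherwise 0.
blocks = [0, 0, 0, 0, 1, 1, 1, 1]
sim = [[1.0 if a == b else 0.0 for b in blocks] for a in blocks]

scores = foote_novelty(sim, half_width=2)
print(scores)
print("strongest boundary at index", scores.index(max(scores)))
```

The half-width sets the time scale of the comparison: a wider kernel responds to slower, larger shifts and smooths over brief fluctuations.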
The answer is not determined by the technical limits of any algorithm. It depends, rather, on the size of the blind spots in our knowledge of the literary past — and it’s part of the definition of a blind spot that we don’t already know how big it is. How far do you have to...