Text cloud for two of my papers
I have two papers from my Ph.D. research currently in review – one in Sedimentology and one in GSA Bulletin. The one in Sedimentology is getting very close to being ‘in press’ … I’m working with the associate editor right now to clear up some issues. The one for GSA Bulletin was submitted in late December, so I expect I’ll get the first reviews back in the next month or so.
I’m looking forward to posting about these projects on this blog, but I also feel it’s important to wait until they are published before I do that.
In the meantime, I was playing around with a tag cloud generator (the one I used is called TagCrowd) for each of the papers. I’m a big fan of tag clouds and other visualization approaches, in general, and was curious what the text from one of my scientific papers would look like.
Here’s the text cloud for the paper in Sedimentology:
Firstly, you can see how many times I have “et al” in there … those are among the most frequent. Same with the shorthand for figure as “fig”. The rest will remain a mystery for now :)
And, here’s the text cloud for the paper in GSA Bulletin:
This is pretty cool … you can get a decent idea about what the papers are about from these visualizations. Give it a shot … it might be cool to see how everyone around the geoblogosphere compares. Use a thesis chapter, proposal, paper for class, published paper, etc. Like I said, I used TagCrowd, but there are probably others. I found the best results by pasting the text in instead of uploading a file.
–
THE PUBLICATION TEXT-CLOUD MEME CAUGHT ON:
Chuck from Lounge of the Lab Lemming
ReBecca over at Dinochick Blogs
Maria from Green Gabbro
Chris from Highly Allochthonous
Kim from All Of My Faults Are Stress-Related
Silver Fox from Looking For Detachment
Julian from Harmonic Tremors
Tuff Cookie from Magma Cum Laude
Dave from Geology News (this one shows results from his entire blog)
Callan from NOVA Geoblog
Lost Geologist from The Lost Geologist
–
… and the meme made it out of the geoblogosphere:
SPARC at molecular B(io)LOG(y)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Have you intentionally culled place names/proper nouns?
I just put the text in as is with default settings … which ignores common words (the, and, of, etc.) … that’s it. I’m sure one could get more sophisticated, but I didn’t here.
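For anyone curious what that amounts to, here is a minimal sketch in Python of the same basic idea: count the words and skip the common ones. This is not TagCrowd's actual code. The function name `word_counts` and the tiny stop-word list are made up for illustration, and a real list of common words would be much longer.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not TagCrowd's real list).
ENGLISH_STOPWORDS = {"the", "and", "of", "a", "in", "to", "is", "for"}

def word_counts(text, stopwords=ENGLISH_STOPWORDS, top_n=50):
    """Return the top_n most frequent words in text, skipping stopwords."""
    # Lowercase the text and pull out runs of letters (digits and punctuation are dropped).
    words = re.findall(r"[^\W\d_]+", text.lower())
    # Count every word that is not in the stop-word set.
    counts = Counter(w for w in words if w not in stopwords)
    return counts.most_common(top_n)

sample = "The fan and the channel of the fan et al"
print(word_counts(sample, top_n=2))
```

A cloud generator would then scale each word's font size by its count. Passing a different set as `stopwords` changes which words get ignored.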
Your deep-sea fan affinities shine through, though – Normark has a pretty prominent place in your second cloud!
Normark is the second author on that paper … my work was, in a way, an extension of some of his previous work for that particular submarine fan, so there’s a lot of referencing of it in the text.
OK, this thing is just way too much fun.
It’s a shame it only works for English texts. I don’t have any reports or anything like that written in English. If I use my German reports from uni, it doesn’t ignore the small and common words. So my cloud is full of und, ist, auch, also, das, der, die, das… :(
Very cool idea! I have nothing of my own to submit, but it’d be very interesting to run newspaper articles through TagCrowd.
Lost Geologist … that is a shame … I’m surprised someone hasn’t made a German version somewhere.
Adam … you could run your own blog posts through it … see the trackback from Geology News, that’s exactly what they did, with neat results.
Lots of fun!
Maybe “Lost” could use the word list thingy for words to leave out, and have it ignore most of the little/common German words.
Oops, forgot to say – you’re tagged for the Six Word Meme, if interested – and I’m getting a little memed-out!
I’ve been away from the computer the last few days … will try and do the 6-word meme soon.
This was lovely to read.