Hi, My Name is Junk Monkey.


The first program for the Manhattan Project will be the Junk Monkey Plug-In for Firefox. The proposed plug-in would take the source code of a web page, say an article from The New York Times, and isolate the body text. The script would then analyze the text using linguistic and grammatical cues. After mining potentially interesting items, the program would cross-reference each item against the database relevant to it; Wikipedia or Google could serve as databases, for example. The program would work in a click on/off style, similar to the Dashboard in Mac OS X. If the user has a question, or wishes to learn more about the story, she can click on the program to see what links and follow-ups can be made from the story. To ensure flexibility, the user can choose which database she would like to use. She would set these preferences when opening the program for the first time, deciding which database to use for pronoun searches (for example) by choosing between Wikipedia, Google, Dictionary.com or Ask.com.
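The steps described above (isolate the body text, mine candidate items, cross-reference them against a chosen database) might be sketched roughly as follows. This is only an illustration in Python, not the plug-in itself: the function names, the "text inside `<p>` tags" heuristic, the capitalized-word cue and the URL templates are all assumptions made for the example, and a real implementation would need far better linguistic analysis.

```python
import re
from html.parser import HTMLParser
from urllib.parse import quote_plus

# Assumed search URL templates; a real plug-in would read these from user preferences.
SEARCH_TEMPLATES = {
    "wikipedia": "https://en.wikipedia.org/wiki/Special:Search?search={query}",
    "google": "https://www.google.com/search?q={query}",
}


class ParagraphExtractor(HTMLParser):
    """Collects text found inside <p> elements, a crude proxy for body text."""

    def __init__(self):
        super().__init__()
        self._in_paragraph = False
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_paragraph = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_paragraph = False

    def handle_data(self, data):
        if self._in_paragraph:
            self.paragraphs[-1] += data


def isolate_body_text(html):
    """Return the page's paragraph text with whitespace normalized."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(p.split()) for p in parser.paragraphs if p.strip())


def candidate_terms(text):
    """Naive cue (an assumption): runs of two or more capitalized words."""
    pattern = r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b"
    terms = []
    for match in re.findall(pattern, text):
        if match not in terms:  # keep first occurrence, preserve order
            terms.append(match)
    return terms


def lookup_links(terms, database="wikipedia"):
    """Map each term to a search URL in the chosen database."""
    template = SEARCH_TEMPLATES[database]
    return {term: template.format(query=quote_plus(term)) for term in terms}
```

Calling `lookup_links(candidate_terms(isolate_body_text(page_html)))` would give the list of links the user sees when she clicks the program on. The capitalized-word rule will also catch ordinary words at the start of a sentence, which is one reason the linguistic cues need real design work.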

General Algorithm Outlines:

General Algorithm Outline


General Example

Health Report Example

Opinion Column Example

See an example here (limited while supplies last).


1 guidelines
2 graphic design and user interface
3 programming
4 resources


guidelines

There are three main areas to be developed: Graphic Interface/User Experience, Programming and Resources. The topics are explained below and are linked to blank pages for discussion. Please find your strongest interest(s).

graphic design and user interface

Some crude pictures have been drawn up to illustrate the general idea for the program's graphic interface. What are the benefits and disadvantages of an on/off function? Would a highlighted graphic interface be more productive (think Gmail Spellcheck)? Could both be present in the program?



programming

Junk Monkey will be a plug-in for Firefox, but we want it to fit easily into other browsers in the hopefully not-too-distant future. We would also like the program to be open source, so that outside users can easily modify it. What language would best meet these needs?

In developing a plug-in, what is the hierarchy of technical aspects? As in: when programming a plug-in, what are the heavy-duty jobs compared to the ones that only take a few moments?



resources

Junk Monkey relies on a network of databases. What are the most thorough and accessible databases for this sort of work? Into what categories should databases be grouped when the user chooses them on opening Junk Monkey for the first time? Is there some way to access registration-only databases (such as JSTOR) that are available to most college students? Would there also be a reason to host a new database for areas that are more sparsely covered than others (news reporters and writers)?
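One way to picture the first-run choice is as a small table that maps each search category to a database. The sketch below is in Python and is only a starting point for discussion: the database names come from the proposal, but the category names, the defaults and the function names are hypothetical.

```python
import json

# Databases named in the proposal.
DATABASES = ("Wikipedia", "Google", "Dictionary.com", "Ask.com")

# Hypothetical categories and defaults; the real categories are an open question.
DEFAULT_PREFERENCES = {
    "names": "Wikipedia",
    "definitions": "Dictionary.com",
    "everything else": "Google",
}


def make_preferences(choices=None):
    """Merge the user's first-run choices over the defaults, rejecting unknown databases."""
    preferences = dict(DEFAULT_PREFERENCES)
    for category, database in (choices or {}).items():
        if database not in DATABASES:
            raise ValueError(f"unknown database: {database!r}")
        preferences[category] = database
    return preferences


def save_preferences(preferences):
    """Serialize preferences to JSON so they survive between sessions."""
    return json.dumps(preferences, indent=2, sort_keys=True)


def load_preferences(serialized):
    """Restore preferences from JSON, re-validating the database names."""
    return make_preferences(json.loads(serialized))
```

Keeping the mapping as plain data means a new database or category can be added without touching the lookup code.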

The proposal talks about linguistic cues to link stories. What are some common phrases used to describe scientific reports, other news stories or political events? Once a set of example phrases has been developed, it should be tested on stories from The New York Times, CNN.com and FoxNews.com to check viability and multi-source flexibility.
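To make the testing step concrete, a cue list could be run against article text with something as simple as the Python sketch below. The phrases shown are placeholders invented for illustration, not a proposed final list, and the sentence splitter is deliberately crude; the function names are likewise assumptions.

```python
import re

# Placeholder cues for illustration only; contributors would supply the real lists.
CUE_PHRASES = {
    "scientific report": ["according to a study", "researchers found", "published in"],
    "political event": ["voted to", "signed into law", "announced a bill"],
}


def split_sentences(text):
    """Split on whitespace that follows sentence-ending punctuation (a rough heuristic)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tag_sentences(text, cues=CUE_PHRASES):
    """Return (category, matched phrases, sentence) for every sentence containing a cue."""
    tagged = []
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        for category, phrases in cues.items():
            hits = [phrase for phrase in phrases if phrase in lowered]
            if hits:
                tagged.append((category, hits, sentence))
    return tagged
```

Running the same cue list over stories from each news source and comparing how many sentences are tagged would give a first, rough measure of multi-source flexibility.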