News Flash!
ODP Janitors' Union Formed
Strikes for a Cleaner Directory
I recently got a note from an editor, saying "Thank you
for correcting that spelling error in Regional/Pangaea/Erewhon/Weather,
I've looked at it a hundred times and not seen that it was wrong. How did
you find it? Can you teach me your tricks?" Other editors, too,
have looked in their category edit logs, seen my footprints, and wondered
what brought me wandering by.
The answer is simple, of course: I take a copy of the RDF dump (the way the ODP publishes its database), print it out, feed it into a shredder, and let several thousand specially-trained penguins root through the debris and bring me fragments with spelling errors on them. That's what works for me, anyway. :-)

If you don't happen to have access to trained penguins, you'll have to look for spelling errors yourself. The first habit to get into is to actually look at categories that you edit. It can be hard to find problems when you're accustomed to seeing them. For some people it may help to print out the public view of a category to look at, or to swap proofreading services with someone who edits in a different part of the directory.

Another way is to cut and paste (or otherwise load) the public or edit view of a category into a word processor, and use the built-in spell checker to find misspelled words. Scroll around to match the highlighted errors with the entry at fault, and edit it. Remember, though, that most spell-checking programs won't catch usage problems like typing "that" instead of "than", or using "it's" (it is) rather than "its" (possession or property of it).

The second big trick is to realize that errors aren't just random. There are patterns to them, whether it's common typos like "teh" instead of "the", or common misspellings like "freind" instead of "friend". Chances are, if you make a typo or misspell a word, you've made the same mistake before, and so have other people. (This is a special case of what used to be known on Usenet as "Ugol's Law" - "If you ever have to ask 'am I the only one who does <anything>?' the answer is invariably 'no.'")

So, if you want to clean up a lot of errors in a hurry, instead of hunting through a category looking for errors, you might want to look for one specific error in lots of categories.
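For editors who like to tinker, the "one specific error in lots of categories" idea can be sketched in a few lines of Python. This is only an illustration, not an ODP tool: the function name, the starter list of typos, and the shape of the input (a dictionary mapping category paths to description strings, which you would have to pull out of the RDF dump or a saved page yourself) are all assumptions made for the example.

```python
import re

# Hypothetical starter list: known typo -> intended word.
COMMON_ERRORS = {"teh": "the", "freind": "friend", "porgram": "program"}


def find_known_errors(listings, errors=COMMON_ERRORS):
    """Return (category, word_found, suggestion, text) for every hit.

    `listings` is assumed to map a category path to a list of
    description strings; how you obtain it is up to you.
    """
    # One alternation of all known typos, matched as whole words only,
    # ignoring case, so "Teh" is caught but "freindly" is not.
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, errors)) + r")\b", re.IGNORECASE
    )
    hits = []
    for category, descriptions in listings.items():
        for text in descriptions:
            for match in pattern.finditer(text):
                word = match.group(1)
                hits.append((category, word, errors[word.lower()], text))
    return hits


if __name__ == "__main__":
    # Made-up sample data, purely for demonstration.
    sample = {
        "Regional/Pangaea/Erewhon/Weather": ["Forecasts for teh capital."],
        "Arts/Example": ["Teh best site, run by a freind."],
    }
    for category, word, fix, text in find_known_errors(sample):
        print(f"{category}: '{word}' -> '{fix}' in: {text}")
```

Matching whole words keeps the report short, at the price of missing longer forms such as "freindly"; drop the trailing `\b` from the pattern if you would rather catch those too and sort out the false alarms by hand.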
Obviously this works better the higher up in the tree you have access, but even lower-level editors can do the searching, and then smile cheerfully as they pass the results on to other people to fix. (More on that later.)

It used to be that there was only one way to automate hunting batches of spelling errors: use the search box (at the top of every ODP page) to look for specific misspelled words, limited to the tree in question, open the category in a new window, click the "edit" link to get to the edit view, scroll around to find the site at fault, and edit it. Repeat the process for every site within the category, and every category that has search results. Needless to say, this is a slow and painful process.

Finding and fixing errors is a lot easier with ODP Search for Editors, a tool written by editor newwave. Like all of the other productivity-boosters in the ODP Tools area, this automates simple repetitive tasks so that human energy can be focused on the things that only humans can do. It works the same way as the regular ODP search - enter one or more keywords, and optionally a category to restrict to. Results are displayed in batches of 20, but the links go to the edit view of the category or listing in question. Unfortunately, this also has the same limitations as the regular search, notably that the search index is only updated every day or so. You might open up all the edit windows and find that someone else fixed the errors an hour ago.

Why would two editors be looking for the same error on the same day? Well, we all know that the ODP is a team effort, that we get better results by working together rather than all going off in different directions. Traditionally this cooperation has been among editors in the same category, or parallel parts of the directory structure, but in recent months there's been a concerted group effort among editors from a wide range of topics to improve overall directory quality, mostly by fixing spelling errors.
This group has become known as the "janitor gang", and we're always looking for fresh meat, um, er, we always welcome new participants. :-)

The primary turf of the janitor gang is the spelling threads in the General Forum. Look for a thread with "Spelling Errors" in the title; the current incarnation is "Spelling Errors - But Wait, There's More!". Inside you will find lots of cryptic entries in this format:

http://newwave.primeline.com/dmoz/search2.kizml?search=porgram* (3 of 4)

Translated into English, what this means is there are currently four search matches for "porgram" or a longer word that starts with "porgram" (perhaps "porgrams" or "porgramming"), and that the poster felt that three of the four needed to be corrected. The fourth is to be excluded for some reason that is expected to be obvious - perhaps the "typo" is part of a functional URL, a proper name, or an acronym. The poster couldn't or didn't fix the errors directly, but published them for others to take care of.

If you get in the habit of doing a global search every time you fix a typo or spelling error, you will find that many of them are, unfortunately, duplicated elsewhere. Even if you can't fix all of the occurrences, you can post a note in the "janitors' spelling thread" with your findings. If you are a lower-level editor, your responsibility ends there; you can walk away smugly, knowing that someone else will almost certainly jump in and make the necessary fixes, and the end result is a cleaner directory for everyone.

Of course, since diligence and competence are their own punishments at the ODP, you may one day find yourself magically turned into one of those higher-level editors who are compelled to fix as many errors as they can...
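For anyone who wants to keep a tally of the spelling-thread entries described above, here is a rough Python sketch of how one such line could be picked apart. It assumes every entry looks exactly like the "porgram*" example (a search URL with a "search" parameter, then "(N of M)"); the function names and the returned field names are invented for this illustration and are not part of any ODP tool.

```python
import re
from urllib.parse import urlparse, parse_qs

# Assumed layout: <search URL> (<number to fix> of <total matches>)
ENTRY = re.compile(r"^(?P<url>\S+)\s*\((?P<fix>\d+) of (?P<total>\d+)\)$")


def parse_entry(line):
    """Split one thread entry into its parts, or return None if it does not fit."""
    m = ENTRY.match(line.strip())
    if m is None:
        return None
    # Pull the search term out of the URL's query string.
    term = parse_qs(urlparse(m.group("url")).query).get("search", [""])[0]
    return {
        "term": term.rstrip("*"),      # the misspelling itself
        "prefix": term.endswith("*"),  # trailing * means "or any longer word"
        "to_fix": int(m.group("fix")),
        "total": int(m.group("total")),
    }


def matches(entry, word):
    """True if `word` is one the entry's search term would cover."""
    word, term = word.lower(), entry["term"].lower()
    return word.startswith(term) if entry["prefix"] else word == term


if __name__ == "__main__":
    entry = parse_entry(
        "http://newwave.primeline.com/dmoz/search2.kizml?search=porgram* (3 of 4)"
    )
    print(entry)
    print(matches(entry, "porgramming"))
```

The difference between `total` and `to_fix` is the number of matches the poster wanted left alone, which is worth keeping in view before correcting everything in sight.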
If you are a higher-level editor (and this can mean anything from an editall or top-level category editor, to the new editor who signs on one day and discovers that there's an even newer editor residing in a subcategory of the one they edit), there are several ways you can handle errors you find deeper in the tree:
While low-level hype and common bad usage may not be the most productive targets for global cleanup efforts, they should still be fixed up in areas where you edit directly, because we all should want to make our own special interests into the best category in the directory. :-)

Of course, some editors are always going to say "spell-checking bores me", or "as soon as this reorg is done", or "right after I clear all the unreviewed." That's OK; we all have different editing styles, and different skills and talents we bring to the project. As long as your edits are good, they are helping to make this directory the best in the world, no matter how few or how many you do. Just remember that some people will discount all the hard work you do if they see spelling errors or sloppy editing, and a bad experience in one category will affect their willingness to use the ODP for another subject. Neatness counts.

Stay tuned for Part II next month: Your Friendly Neighborhood Serial Killer.

-- Pinkie