Corpus and the Web

Understanding how to navigate the internet efficiently is important not only in today’s technical world but also in day-to-day life, whether your work is technical or not. Svenja Adolphs tries to explain how language is used in today’s media and how even search results can be greatly influenced by the words and patterns you use while searching. A good example from her book “Introducing Electronic Text Analysis” is the use of search engines: the patterns and specific language in a query greatly influence the results and, in turn, your understanding of your original search.

Corpus linguistics is the study of language through electronic text analysis of corpora, which are collections of real-world text. Understanding this form of analysis will make us better communicators on the web and in general. An article posted on WiseGeek.com offers a simple illustration: “Parental diaries of a child’s speech as he first acquires language is a simple example of a corpus that can then be studied to learn language patterns.”

The same article begins to touch on the importance of corpora for search engines: “While searching patterns in a corpus of millions of words would take too much time for a human being and the results would be less than accurate, a computer can search and retrieve information in mere seconds. It can calculate frequency, sort data and exploit corpora in ways that were impossible in the past.”

With this kind of sorting and an understanding of how a corpus is processed, we should eventually be able to weed out the unhelpful results that come up when we search and get a better grasp of the actual subject matter. The article also clarifies the purpose of the electronic corpus by highlighting what a computer can do that a person cannot: search millions of words at a speed no human reader could match.
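To make the frequency counting and sorting mentioned in the quoted article a little more concrete, here is a minimal sketch in Python. It is only an illustration, not the method used by Adolphs or by any search engine: the sample text is made up, and the function name word_frequencies is a hypothetical helper chosen for this example.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count how often each word occurs in a text, ignoring case."""
    # Lowercase the text and treat runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


# A tiny made-up corpus; a real one could hold millions of words.
corpus = "The dog ran. The cat sat. The dog sat on the mat."

frequencies = word_frequencies(corpus)

# Sort by frequency, most common first, and show the top three words.
for word, count in frequencies.most_common(3):
    print(word, count)
```

Even on this short sample, the most frequent word turns out to be “the”, which hints at why searching for very common words alone tends to return less useful results than searching for rarer, more specific ones.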

Corpus linguistics is important because it opens up the world of information and makes it instantly available to us. It is also important to understand how these patterns work so that we can express ourselves clearly on the web.
