Spelling

Google Search is a writer’s friend: Primo spell checker

For years now I’ve used Google Search as my go-to spell checker on the internet for words that stump Microsoft Word’s spell checker (which is unfortunately a pretty low bar … “no spelling suggestions” and red-underlined words are a pretty common occurrence. I may get one yet writing this sentence).

A spell checker is an application program that flags words in a document that may not be spelled correctly. Spell checkers may be stand-alone, capable of operating on a block of text, or part of a larger application, such as a word processor, e-mail client, electronic dictionary, or search engine.

“The spell checker scans the text and extracts the words contained in it, comparing each word with a known list of correctly spelled words (i.e. a dictionary). This might contain just a list of words, or it might also contain additional information, such as hyphenation points or lexical and grammatical attributes,” Wikipedia tells me.

“An additional step is a language-dependent algorithm for handling morphology. Even for a lightly inflected language like English, the spell-checker will need to consider different forms of the same word, such as plurals, verbal forms, contractions, and possessives. For many other languages, such as those featuring agglutination and more complex declension and conjugation, this part of the process is more complicated.”
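The process Wikipedia describes can be sketched in a few lines of Python. This is only a minimal illustration, not how Word or any real spell checker is built: the tiny `DICTIONARY`, the function names, and the crude "strip a trailing s" plural rule are all assumptions made up for the example, and suggestions come from the standard library's `difflib` rather than any production algorithm.

```python
import re
from difflib import get_close_matches

# Tiny stand-in word list (an assumption for this sketch);
# a real checker would load many thousands of words.
DICTIONARY = sorted({"the", "temperature", "is", "measured", "in", "degree", "fahrenheit"})


def extract_words(text):
    """Pull lowercase word tokens (letters and apostrophes) out of the text."""
    return re.findall(r"[a-z']+", text.lower())


def is_known(word):
    """Accept a word if it, or a crudely de-pluralized form of it, is listed."""
    if word in DICTIONARY:
        return True
    # Very rough morphology handling: treat a trailing "s" as a plural ending.
    return word.endswith("s") and word[:-1] in DICTIONARY


def check(text):
    """Return {flagged_word: [suggestions]} for words not found in the dictionary."""
    flagged = {}
    for word in extract_words(text):
        if not is_known(word):
            # Suggest dictionary words that look similar to the flagged word.
            flagged[word] = get_close_matches(word, DICTIONARY, n=3, cutoff=0.6)
    return flagged


result = check("The temperature is measured in degrees Farenheight")
print(result)
```

Here "degrees" passes because of the plural rule, while "Farenheight" is flagged and matched against the word list purely on how similar the letters are, with no knowledge of the surrounding sentence.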

Most of the time I do know how to spell the word triggering the red alert, but my thinking has a tendency to overrun my largely two-index-finger typing when I am composing something quickly in my head as I write (err … type). Sometimes, when more than one word is involved, it is just as fast to copy and paste the sentence into Google Search as to correct several suspect words individually. Sometimes, of course, I correct the word in Word just to make sure I really do remember how to spell it. Sort of like doing math in your head, or at least on paper with a pen or pencil, rather than using a calculator. We pretty much all figure we should be able to do those things manually; we just don’t want to overdo it.

This got me thinking the other day, wondering why Google Search is so much better at correcting my spelling in sentences, almost as an afterthought, while it completes a search that may or may not be additionally helpful in and of itself. Google Search will often finish a sentence correctly for me, even if I only paste or type a part of the sentence into the search box or bar.

My first hunch was that it had something to do with the vast amount of data Google Search processes, with over three billion searches a day, and with the algorithms and other proprietary tools developed from that data.

My second hunch was that if I was pondering this, other people had thought about it, researched it, and likely written about it before me.

My intuition for both hunches turned out to be correct.

Intuition, in fact, is what Google Search is all about. What makes it intuitive? Context. Context rules.

John Breeden II, the Washington, D.C. chief executive officer of Tech Writers Bureau, wrote about the topic in a Nov. 18, 2011 article for Government Computer News (GCN). He was formerly the laboratory director and senior technology analyst for GCN, where he reviewed thousands of products aimed at the U.S. federal government – everything from notebooks to high-end servers – and at the same time decoded highly technical topics for broad audiences.

“My biggest problem with Word is that there are some words that simply trip it up,” Breeden wrote. “When writing about temperature for our many rugged reviews, I always put ‘Farenheight,’ which Word thinks should be changed to ‘Fare height.’ That doesn’t help at all.

“However, when the same misspelled word is pasted into Google, it says, ‘showing results for Fahrenheit instead.’ There are quite a few other words that confuse Word but not Google. They are not difficult to find.

“I have to wonder why Google is so smart when it comes to figuring out what word a user wants to use. My guess is that the database Google is pulling from is so massive that it’s probably seen a lot of the same basic spelling mistakes. There are probably a lot of people who have wanted to search for Fahrenheit but typed in ‘Farenheight’ instead. Nice to know that I’ve got company.

“You would think it would be simple for word processors to use the same type of technology to improve their accuracy, but I suppose that would involve capturing data from their users and then making the connections between common mistakes and the accurate spelling.

“I thought that is what spell check was supposed to do, but instead I think it just matches the misspelling with words that are somewhat close to what you’ve typed. And Google obviously goes beyond that to associate common mistakes with actual words.”

An anonymous poster at Quora, a question-and-answer website where questions are asked, answered, edited and organized by its community of users, wrote on Sept. 1, 2012 in response to the question, “How is google so good at correcting spelling mistakes in searches?”:

“Google (search engines in general) has clusters processing tons (TB’s) query logs, which try to learn the transformation from original misspelled sentence to the corrected one. These transformation schemes are fed into the front end servers which serve the auto completion (and/or corrections to queries).

“Also these servers have lot more processing power and memory and disk space of course will not be an issue at all (for the learned transformations).

“Also since Google crawls the entire web regularly it will learn new words and suggest corrections Word can’t do till next release.”

Quora also aggregates questions and answers to topics.

“Desktop software usually have tight constraints on processing power, memory or disk space they could use to run compared to that of server based applications and usually are expected to keep the internet usage to a minimum (at least for MS Word.)

“They use static resources (dictionary that might only be current at the time of launch) and can’t employ complex algorithms due to the above said restrictions and hence employ heuristic algorithms which may not [be] very predictive of the correct word.”

Cosmin Negruseri, vice-president of engineering at Addepar, an investment management technology company, replied the same day. Negruseri formerly worked at Google (both companies are based in Mountain View in Santa Clara County, California) as an engineer on ads, search and Google Code Jam, an international programming competition hosted and administered by Google. He wrote: “The main insight in modern spell correctors is using context. For example New Yorp is a misspelling of New York with a high probability.”
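To make the "context" idea concrete, here is a small Python sketch. It is not Google's method: the word-pair counts in `BIGRAM_COUNTS` are invented for illustration, the function name is hypothetical, and the "egg yolk" pairing is my own added example. The point is only that the same typo can be corrected differently depending on the word in front of it.

```python
from difflib import get_close_matches

# Invented counts of adjacent word pairs (an assumption for this sketch);
# a real system would learn these from very large amounts of text or query logs.
BIGRAM_COUNTS = {
    ("new", "york"): 900,
    ("new", "yolk"): 1,
    ("egg", "yolk"): 400,
    ("egg", "york"): 1,
}
VOCABULARY = sorted({word for pair in BIGRAM_COUNTS for word in pair})


def correct_with_context(previous_word, word):
    """Pick the similar-looking vocabulary word that most often follows previous_word."""
    if word in VOCABULARY:
        return word  # already a known word, nothing to correct
    # Candidates are vocabulary words whose spelling is close to the typo.
    candidates = get_close_matches(word, VOCABULARY, n=5, cutoff=0.4)
    if not candidates:
        return word  # nothing similar enough, leave the word alone
    # Context decides among candidates: prefer the most frequent pairing.
    return max(candidates, key=lambda c: BIGRAM_COUNTS.get((previous_word, c), 0))


print(correct_with_context("new", "yorp"))
print(correct_with_context("egg", "yorp"))
```

With these made-up counts, "yorp" after "new" comes back as "york", while the same typo after "egg" comes back as "yolk". A checker that only looks at letter similarity would give the same answer both times.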

You can also follow me on Twitter at: https://twitter.com/jwbarker22

Education

High school redux

Being a Catholic high school graduate wasn’t high on the list of things top of mind when I moved to Manitoba in 2007. That’s mainly because my high school days were some 30 years behind me – or at least so I thought at the time.

Turns out, however, Sister Andrea Dumont, the longest-serving religious in Thompson, is originally from St. Catharines, Ontario, and a member of the Congregation of the Sisters of St. Joseph of Toronto, who – wait for it – just happen to be the same sisters who taught some of my classes from September 1971 to June 1976, when Sister Conrad Lauber was principal and Sister Dorothy Schweitzer taught me several English classes – and Grade 10 general math – at Oshawa Catholic High School (previously known as St. Joseph’s High School and later Monsignor Paul Dwyer Catholic High School). Sister Dorothy also taught high school in Toronto, Vancouver and Edmonton, as well as Oshawa.

Trying to teach me high school math must have given real meaning to terms like “long suffering” and “patience of a saint.” As I recall, there were two mathematics “streams” back then: “advanced” and “general.” Since these were in the days before there was much articulation of the concept of “bullying,” many of your classmates had no reservation about saying that “general” math was for “dummies” or “dunces.” Self-esteem aside, I’d have been hard-pressed to argue the point, especially since I struggled with math no matter what the label: algebra, geometry, functions and relations – shoot me now, just remembering the words, much less the symbols and equations. If I had known how many percentages I would have to convert as a journalist, I might have paid more attention to high school math, but perhaps not.

It was only after I became a parishioner at St. Lawrence Catholic Church here and met up with Sister Andrea that I realized Sister Dorothy and Sister Conrad, more than three decades on, are still alive and active – and that Sister Andrea knows them and often sees them on visits home to Southern Ontario. Sister Andrea spent 14 years in Guatemala and since returning to Canada has lived in Grand Rapids, Easterville and Thompson, where the main focus of her work is adult education, which includes training lay presiders for times when there is no priest available, organizing and instructing in the various ministries, sacramental preparation and RCIA (Rite of Christian Initiation of Adults).

Sister Conrad Lauber, ministry director for Fontbonne Ministries’ Village Mosaic in Etobicoke, described as an “unsung hero,” was awarded the Queen Elizabeth II Diamond Jubilee medal in June 2012. Village Mosaic’s focus is always, Sister Conrad says, “about relationship building, bringing participants together to form community.”

I remember Sister Conrad, then my principal, sitting in her office my last year of high school, as she showed me her debating awards, after I had once again been defending some decidedly non-Catholic propositions in inter-high school debating tournaments. She got it. She understood the intellectual exercise. But unlike me at the time, she also understood more was at stake. She didn’t ask me to stop debating, only whether I could perhaps tone down some of my rhetoric a bit when representing the school in public at debates.

I had a wonderful e-mail reply from Sister Dorothy several years ago, where she said in part: “You write very well (this is your former English teacher speaking!) and astutely. And thank you for your kind words – it’s comforting to know, so many years later, that my efforts were not all in vain!”

A wonderful flash, indeed, of Sister Dorothy’s characteristic good humour, not to mention perhaps a diplomatic or discreet indirect reference to Grade 10 general math class.

For any of you reading this who may have grown up in the Durham Region of Southern Ontario, just east of Toronto, or still live there, and are interested especially in Catholic secondary education in the 1960s or 1970s, Ken Bodnar’s blog called My OCHS at http://myochs.blogspot.ca/ is the first and last word on our high school days and years. Ken has it all: history, both official and unofficial, trivia, the arcane, milestones, biographical sketches and old photos from his own archive of old negatives, yearbooks and other sources. Ken is the unofficial archivist for all things relating to St. Joseph’s High School, Oshawa Catholic High School, or Monsignor Paul Dwyer Catholic High School, as students now call its hallowed halls. You can contact Ken by e-mail at: ochsblogger@rocketmail.com
