Sarah sends a comment asking about online contextual advertising: why do those snappy short ads work so well in certain cases, and not at all in others? And is this influencing other parts of our language use at all?
This is in fact an area of applied internet linguistics which I've spent a lot of time on over the last ten years. You can read more about it at www.crystalsemantics.com - but essentially my procedure avoids the problem you've noticed by providing a full lexical specification of the content of a page. To see why this is needed, consider the following example.
A few years ago there was a page on CNN reporting a street stabbing in Chicago. The ads down the side said such things as 'Buy the best knives here', 'Get knives on eBay', and so on! The stupid software had found the word 'knife' and assumed that the page was about knives, and automatically assigned cutlery ads to it. No-one was happy, least of all the cutlery firms, who certainly didn't want their product to be associated with homicides.
To avoid this crazy result, my approach analyses all the content words (strictly, 'lexemes') on the page and weights them in terms of relevance. For the CNN page, a word like 'knife' would be outranked by the cluster of other words on the page that relate to crime. It then classifies the page using a set of around 1500 categories derived from the taxonomy I developed when working on the Cambridge encyclopedia family (there's an earlier post about that). It would conclude that this is a page about a crime - specifically, a homicide. It might also conclude that it was about some other things, too, such as policing or urban renewal. (Web pages are usually multi-thematic.)
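For readers who like to see the mechanics, here is a minimal sketch of the weighting idea in Python. The mini-lexicon, the scores, and the category names are all invented for this post - the real system works with around 1500 categories and every sense of every content word - but it shows how a cluster of crime-related lexemes outranks a lone 'knife'.

```python
from collections import Counter

# Toy lexeme-to-category lexicon, invented for illustration.
# The real system assigns every sense of every content word
# to one of around 1500 encyclopedic categories.
CATEGORY_LEXICON = {
    "knife": "cutlery",
    "stabbing": "crime",
    "victim": "crime",
    "police": "crime",
    "arrest": "crime",
}

def classify_page(tokens, top_n=3):
    """Weight each category by how many of the page's content
    words fall under it, and return the top-scoring categories.
    (Web pages are usually multi-thematic, hence a ranked list.)"""
    scores = Counter()
    for token in tokens:
        category = CATEGORY_LEXICON.get(token.lower())
        if category:
            scores[category] += 1
    return scores.most_common(top_n)

page = ("Police said the stabbing victim was attacked with a "
        "knife before officers made an arrest").split()
print(classify_page(page))   # [('crime', 4), ('cutlery', 1)]
```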
Any advertiser wanting to place an ad alongside this report would want it to be relevant to the page - an ad about crime prevention, say, or careers in the police force. All advertisers have to do is apply the same classification system to their ad portfolios, and the software picks out the relevant ads. It's a simple principle, but it works very well, and it's now being put to commercial use by adpepper media, the company currently developing it.
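The matching step can be sketched in the same way. Continuing the toy example above, suppose each ad in the portfolio carries the same category labels; selection then reduces to an overlap test, with a simple threshold so that a weakly scoring category like 'cutlery' doesn't pull in the knife ads. The portfolio and the threshold here are my inventions, not adpepper's actual method.

```python
# Invented toy ad portfolio, labelled with the same categories
# as the pages.
AD_PORTFOLIO = [
    {"ad": "Home security systems",       "categories": {"crime"}},
    {"ad": "Careers in the police force", "categories": {"crime"}},
    {"ad": "Buy the best knives here",    "categories": {"cutlery"}},
]

def select_ads(page_categories, portfolio, min_share=0.5):
    """Keep page categories scoring at least min_share of the top
    score, then return ads whose labels overlap that set."""
    if not page_categories:
        return []
    top_score = page_categories[0][1]
    relevant = {cat for cat, score in page_categories
                if score >= min_share * top_score}
    return [entry["ad"] for entry in portfolio
            if entry["categories"] & relevant]

# Scores from the page sketch above: crime dominates, cutlery trails.
print(select_ads([("crime", 4), ("cutlery", 1)], AD_PORTFOLIO))
# ['Home security systems', 'Careers in the police force']
```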
The principle is simple, but the linguistics took a long time to develop - ten years, in fact. Every sense of every content word in a college-sized English dictionary had to be investigated and assigned to the relevant encyclopedic category, and significant collocations also had to be identified. The initial task took a team of lexicographers several years, and the software engineering took another team several years more. Indeed, the refining of the approach is still going on, to make sure it is fast enough and robust enough to cope with commercial demands, which might run to hundreds of millions of page-analyses and ad-assignments a day.
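To give a feel for what that lexicographic work produces, here is a hypothetical fragment of the kind of data involved. The field names and entries are invented for this post; the point is that categories attach to senses, not to word forms, and that significant collocations can settle which sense is in play.

```python
# Hypothetical sense-level lexicon fragment. Each *sense* of a
# word carries its own category.
SENSE_LEXICON = {
    "knife": [
        {"sense": "cutting implement", "category": "cutlery"},
        {"sense": "weapon",            "category": "crime"},
    ],
}

# Significant collocations that select one sense over another.
COLLOCATIONS = {
    ("knife", "attack"):   "crime",
    ("knife", "stabbing"): "crime",
    ("kitchen", "knife"):  "cutlery",
}

def categories_for(word, neighbours):
    """Prefer a collocation match; otherwise return every sense's
    category and let the page-level weighting decide."""
    for other in neighbours:
        for pair, category in COLLOCATIONS.items():
            if word in pair and other in pair:
                return [category]
    return [sense["category"] for sense in SENSE_LEXICON.get(word, [])]

print(categories_for("knife", ["stabbing", "victim"]))  # ['crime']
print(categories_for("knife", ["kitchen", "drawer"]))   # ['cutlery']
```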
Incidentally, the same procedure can be used for other internet applications, such as improving search-engine relevance, automatic document classification, and internet security. It's difficult to get the big firms and organizations to run with these new ideas, though, I find. They are very set in their ways, and would rather carry on using their familiar methods (even if they don't work that well) than invest in new strategies. For instance, a couple of years ago I developed a method (called 'Chatsafe') for tracking paedophile gambits in conversations, based on this sort of lexical analysis. It worked fine, and I thought it would be welcomed by the Powers That Be concerned with this sort of thing, such as the Home Office or the chatroom companies. But despite a lot of talk, nobody picked it up, so it's stayed on the shelf.
Is this influencing language use in general? I don't see much sign of that. I talk about the extent to which the internet is influencing current usage in my Language and the Internet, and also in A Glossary of Textspeak and Netspeak. Although the internet is linguistically revolutionary in certain respects, the impact it has so far had on actual usage in a language is pretty limited.