The banner displayed above this sentence is a "Google AdSense" advertising
Those Google programs can "mark" each visiting browser by storing a unique cookie in it. Each time such a Google-instrumented web "page" is browsed, the triggered Google program can read and write cookies, then record in one of Google's databases that "the browser storing this cookie browsed this URL, at this date". This enables Google to track the visitor and to learn his browsing habits: which 'pages' did he read? When? For how long? Which links did he click? Where is he (various commercial services map many IP addresses to countries and cities)? In many cases Google even has the visitor's identity because he registered for some Google offering (thus enriching the information associated with his Google 'cookie' in Google's databases). Even without the visitor's identity, Google knows that "the browser used to view this site on this date was also used to view this other one".
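The mechanism described above can be sketched in a few lines. This is only a toy model, not Google's actual code: the `Tracker` class, its method names and the example URLs are all hypothetical, and a real tracker would run as a script embedded in each page and exchange the cookie over HTTP.

```python
import uuid
from collections import defaultdict


class Tracker:
    """Toy third-party tracker: one log shared by every instrumented site."""

    def __init__(self):
        # cookie id -> list of (url, date) records
        self.visits = defaultdict(list)

    def on_page_view(self, cookie, url, date):
        # First contact: "mark" the browser with a fresh unique identifier.
        if cookie is None:
            cookie = uuid.uuid4().hex
        # Record "the browser storing this cookie browsed this URL, at this date".
        self.visits[cookie].append((url, date))
        return cookie  # sent back to the browser, which stores it

    def history(self, cookie):
        return [url for url, _ in self.visits[cookie]]


tracker = Tracker()
browser_cookie = None  # a browser that has never met the tracker
browser_cookie = tracker.on_page_view(
    browser_cookie, "http://site-a.example/solar", "2007-05-01")
browser_cookie = tracker.on_page_view(
    browser_cookie, "http://site-b.example/news", "2007-05-02")
print(tracker.history(browser_cookie))
```

Because both (hypothetical) sites report to the same tracker, the two visits end up under one identifier, even though the tracker never learned who the visitor is.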
This tracking is efficient because the typical Web user uses "his own browser" (actually, his own set of browser configuration and data files). In other words, very few people "share" a single browser setup or frequently purge the cookies stored in their browsers (doing so is somewhat inconvenient because, for example, it prevents access to some websites).
Many visitor-tracking programs, on many websites, track visitors, but none seems to be as pervasive as Google's.
Google mainly earns money from its AdWords offering (selling ad banners), which uses AdSense. For now AdSense populates an ad banner with links to sites that use terms also used in the web "page" where the banner is to be displayed. The reasoning is: "if the visitor landed here, the very topic of this document is of interest to him, therefore any related advertising is pertinent".
To choose banners accurately, Google's advertising software would benefit from quickly learning each visitor's preferred topics, which change over time, along with their relative importance.
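To make the term-matching idea concrete, here is a minimal sketch that ranks ads by the number of terms they share with the page. It is an illustration only: the real selection algorithm is not public, and the `pick_ads` function, the advertiser names and their keyword lists are invented for the example.

```python
import re


def terms(text):
    """Lower-cased set of alphabetic words found in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def pick_ads(page_text, ads, n=2):
    """Return up to n ad names, ranked by terms shared with the page."""
    page_terms = terms(page_text)
    scored = [(len(page_terms & terms(keywords)), name)
              for name, keywords in ads.items()]
    # Drop ads with nothing in common with the page.
    scored = [(score, name) for score, name in scored if score > 0]
    # Highest overlap first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:n]]


# Hypothetical advertisers and their keywords.
ads = {
    "SunPanels Inc": "solar cells panels photovoltaic",
    "CheapFlights": "flights travel tickets",
    "BatteryWorld": "battery storage solar",
}
page = "How photovoltaic solar cells convert light into electricity"
print(pick_ads(page, ads))  # ['SunPanels Inc', 'BatteryWorld']
```

Such matching only knows the page, not the reader, which is why knowledge of each visitor's interests would be a valuable complement.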
How can Google gain more intelligence, faster, on a large portion of all Web users of all types?
The most prominent one is Wikipedia which, albeit far from perfect (beware: document written in French), attracts experts (wanting to check any published information), vendors (searching for hints and promotion) and Web users (looking for information) alike.
Google cannot hope that the administrators of Wikipedia's web servers will let it track the visitors, but it knows that this can be done without any intrusive action by hosting the site (meaning: letting the website run on Google's resources, by providing machines, network connections and some human work), which enables it to listen to all network traffic. Google will be able to collect various information (visitors' IP and mail addresses, browser characteristics, Wikipedia cookies...), very useful to track the visitors and both "leveraged by" and "leveraging" similar information collected by other means, enabling Google, for example, to improve visitor identification. This is of interest even if Google hosts only a small subset of Wikipedia's web servers and proxy cache servers (because of the round-robin setup), and doing it adequately can lead Google to manage all of them, thus obtaining all visitors' information.
The Google indexer (search service) already drives a fair part (probably at least 30%) of Wikipedia's incoming traffic (this leadership was clear in 2002 and there is no reason to think that the trend has ended), but it simply cannot gather intelligence on what a visitor does once he interacts with Wikipedia's interface. Worse: Google cannot even know on which article a visitor lands:
Google cannot hope to track this through AdSense because most contributors don't want advertising. Analytics might do the job, but Wikipedia's admins don't seem to be interested.
By managing to host Wikipedia, Google will gain a good insight (first-hand and in real time) into many visitors' topics of interest, because Google will know who reads or writes the articles published by this encyclopedia. This company will therefore gain a competitive advantage by selecting AdSense and AdWords advertisements more accurately. By learning that you browse Wikipedia articles about "solar cells", for example, Google will immediately infer that you may be collecting information before buying one.
For example: some categories used in Wikipedia articles reflect families of commercial products. Therefore the subset of categories used in most of the articles successively read by a given visitor within a short period will help to target ads towards products related to his current need.
In a not-so-distant future the very ability to determine what is of interest to a given visitor may be a major asset, especially if he is also identified. Efficient ways to circumvent such maneuvers, for example using Tor, will then gain momentum. By hosting Wikipedia, Google will almost immediately gain a way to do such magic for a huge and growing proportion of Web users. It will be considered by many as a community-friendly move, ensuring a comfortable marketing splash.
On the technical side, hosting will also enable Google to index Wikipedia better and faster; to easily consolidate various sources of tracking information because (as far as I know) many Wikipedia user accounts are created with a GMail address; to attract more incoming traffic to its network (think peering and projects directed towards the client machine...); to observe the effect (on visitors' behavior) of some semantic Web; to spare the resources it uses to cache Wikipedia...
Google may offer to host Wikipedia, giving a real gift and simultaneously earning from it.
Google offered to host the project (without any modification or advertising insertion!) just after a major hiccup, in 2004: its founders met Jimbo Wales, Wikipedia co-founder. The news at the time, slightly edited (spelling), read:
Jimmy Wales --Wikipedia founder-- met with Sergey Brin and Larry Page, who were extremely enthusiastic about the whole project. Google plans to donate a certain number of 'Dual Xeon' servers at one or more of their data centers and with unlimited bandwidth. In some days Wikimedia Foundation Board is to discuss --via IRC-- the agreement with Google, who aren't to ask anything in return (ads on Wikipedia pages).
Wikipedia needs resources and Google wants to offer them... but the deal languished until 2005 and is still not closed (May 2007), more than 3 years after the first official meeting. Yahoo and Kennisnet offered partial hosting (in Korea and the Netherlands), which is in operation.
Jimbo Wales probably knows that a Google deal may critically hurt some of his projects: millions of bucks to establish a rival to Google by applying the Open Source and transparent ideals of Wikipedia to a search project, using Grub (which in fact seems to be some Nutch hack(?)), a Google-clone in its fundamentals which harnesses some resources of the volunteers' own machines. If this project succeeds it will create a competitor for the search service offered by Google.
Wales' plan may be to let Google think of Wikia as a target, as a community and technology that is potentially dangerous and in any case useful to acquire. He may work to discourage the Wikimedia Foundation from accepting Google's hosting, with persuasive arguments: linking the future of Wikipedia to a commercial entity is dangerous because it merges their fates and the company's shareholders can go berserk; moreover, no one can hope to salvage such a hosting solution many years after having delegated it...
Given the state of Wikipedia's budget (expenses soon to reach 3.3 million USD per year and revenues of 1.6 million USD in 2006), however, one may think that such a deal will be hard to dodge.
Google's answer to Wales is surprising: Google search results now often begin with a link to a Wikipedia article. This is a way to have Wikipedia users prefer, for their usual quick searches, Google search over Wikipedia's integrated search, because most will understand that a Google search has a richer set of results while allowing quick access to Wikipedia. This enables Google to somewhat track the behavior of Wikipedia's users while raising its relative importance as a referrer to Wikipedia.
Google announced (December 2007) a Wikipedia competitor dubbed Knol. If Google cannot host Wikipedia, let's bet that the links to Wikipedia provided in most Google-search results will be replaced by links to Knol.
Google is in the process of consolidating (Single Sign On) the mess of user accounts historically created by each of its services, thus simplifying the tracking process.
'Google Analytics' even enables Google to know what a given tracked visitor publishes on the Web.
Cory Doctorow wrote a piece about "what if google were evil?": Scroogled.