Softpanorama

May the source be with you, but remember the KISS principle ;-)
(slightly skeptical) Educational society promoting "Back to basics" movement against IT overcomplexity and  bastardization of classic Unix

Cookie Cutting


Cookies are a method for maintaining state on the Web. "State" in this case refers to an application's ability to work interactively with a user, remembering all data since the application started, and differentiating between users and their individual data sets. HTTP by design is a stateless protocol.

A cookie is a kind of ticket with which a web site can store some data on the user's computer and retrieve it at a later date. The web site that sends a cookie can't control how it will be stored (in a separate file or as part of a cookie database), only its content and expiration date.

Initially a cookie is a text-only string that is stored in a special area of your browser's memory. If the lifetime of this value is set to be longer than the time you spend at that site, then when you close the browser the string is saved to a file for future reference (that's why, in order to eliminate a cookie, you first need to close the browser; otherwise the cookie might be recreated).

Paul Bonner for Builder.Com on 11/18/1997 attributed the extension of HTTP that enabled creation of cookies to Lou Montulli, who was the protocols manager in Netscape's client product division:

"Lou Montulli, currently the protocols manager in Netscape's client product division, wrote the cookies specification for Navigator 1.0, the first browser to use the technology. Montulli says there's nothing particularly amusing about the origin of the name: 'A cookie is a well-known computer science term that is used when describing an opaque piece of data held by an intermediary. The term fits the usage precisely; it's just not a well-known term outside of computer science circles.'"

The initial idea was to enrich the HTTP protocol (which is stateless) with some state capabilities. Among them is the site designer's ability to track both the user's credentials (enabling automatic login) and how the user uses the site (statistics, favorite pages, total number of pages browsed, and other metrics).

The cookie mechanism has some limitations

During a browsing session cookies are held in memory, but when the browser is exited they are written to a file. It can be a single large file, as in Netscape, or one file per web site, as in IE.

Abuse of cookies mechanism

There are several reasons for web sites to abuse cookies. These range from the ability to track user behavior (which was the intent and is not, strictly speaking, an abuse) to stealing cookies. They also include the ability to store "third-party cookies": cookies belonging not to the web site the user accesses, but to elements of the web page supplied by third-party content providers, first of all advertisers. This way cookies can also be used for collecting demographic information (DoubleClick pioneered this usage).

Cookies also provide programmers with a quick and convenient means of keeping site content fresh and relevant to the user's interests. The newest servers use cookies to help with back-end interaction as well, which can improve the utility of a site by being able to securely store any personal data that the user has shared with a site (to help with quick logins on your favorite sites, for example).

Whether you use Internet Explorer or Netscape, your cookies are saved to a simple text file that you can delete as you please. You need to close your browser first. This is because all your cookies are held in memory until you close your browser. So, if you delete the file with your browser open, it will make a new file when you close it, and your cookies will be back.

Remember that deleting your cookie file entirely will cause you to "start from scratch" with every web site you usually visit. So, it is preferable to open the cookies.txt file (in the case of Netscape) and remove only the entries you don't like, or go to the cookies folder (in the case of IE) and delete the files matching servers you don't want.

Both Internet Explorer and Netscape allow some level of cookie verification. They both have menu options that allow you to accept all, some, or none of your incoming cookies. In addition, the "warn before accepting" feature is present in both, if you want to screen your incoming cookies.

In Netscape, go to the Edit/Preferences/Advanced menu. Your cookie choices can be changed there.

Microsoft has changed their approach to cookies over the last 3 versions of their browser. This is a reflection of how cookies have been thrust into the limelight of privacy on the Internet:

Once a cookie is rejected, it is thrown out and not saved to memory or disk. Don't forget, though, that servers will keep looking for the cookie even if you have discarded it and may try to replace it as you surf around.
Removing a cookie is not a way to tell the server not to send cookies; the server can't remember not to send you any cookies the next time!

A cookie is a simple piece of text. It is not an executable program, or a plug-in. It cannot be used as a virus, and it cannot access your hard drive. 

RFC 2109 limits your total cookie count to 300 (this includes a limit of 20 cookies per individual domain). If you exceed this, the browser will discard your least-used cookies to make room for the new ones.

Microsoft saves cookies into the "Temporary Internet Files" folder, a system folder that you can set the maximum size of (the default is 2% of your hard drive).

In any event, remember that most cookie files are 4KB or smaller, so you would need about a million cookies to occupy 4GB of drive space. 

Cookies as a Threat to Privacy

Revealing any kind of personal information to third parties opens the door for that information to be spread.

Consider the growing trend of technology conveniences in our lives. We use "frequent buyer" cards at supermarkets and gas stations. We place electronic tags on our cars to pay tolls faster and easier. We let banks pay our bills for us automatically each month without checks.

While each of these technologies (and others like them) has made our lives more convenient, each time we lose some bits of privacy. Stores now know what foods you eat. Gas stations know how much you spend on gas per fill-up. The port authority or another company knows exactly when you paid each toll. Banks know way too much about you, as they aggregate all the information from your credit card purchases.

It's the same with cookies. It is a technology that permits you to do more with the HTTP protocol at the expense of your privacy. But if you think that this is the source of the almost complete loss of privacy on the Web, you should first re-examine your membership in Facebook, Gmail, and Hotmail, and second get rid of your credit cards (cash is king, and it is definitely the king of privacy ;-)

Some sites will not work unless you accept their cookies

There are three likely possibilities for problems like this. Firstly, the site you are visiting may be detecting cookies improperly. As a result, it may appear to the site that you are rejecting cookies when in fact you are not.

Another possibility is that you may be running software that interferes with cookie usage. There are many filtering and blocking software packages available for Internet users these days, and many of them also filter cookies. If you are running software like this, then your computer may not receive or send cookies. This will cause sites you visit to assume you are not accepting cookies.

Finally, your machine may be behind a firewall or proxy server that prevents cookie transmission. This is most likely in a corporate environment. So, regardless of how your browser is set, cookies won't be sent or received by your browser. Since the cookies aren't making it through to your browser, the Web Site will assume you personally aren't accepting them.

I deleted my cookies, and I can't log-on to my favorite site

Many sites use a cookie to keep track of your settings on their servers, and to help you log in to their site. If you lose your cookie, that site cannot recall your settings for you to use.

If this happens to you, the best thing you can do is contact that site's webmaster or customer service department.

Doubleclick trick

A server cannot set a cookie for a domain that it isn't a member of. However, almost every Web user has gotten a cookie from "ad.doubleclick.net" at one time or another without ever visiting that site. The trick is to include in the web page elements from other domains. Each such element is downloaded via a separate HTTP request, whose response can set a cookie.

Most websites use central providers for advertisements, which are embedded in the web page. The media services that place those ads have become centralized collectors of user information, even if the user never visits their web sites.

When a page is requested, it is assembled by the browser through many HTTP requests. First, there is a request for the HTML itself. Then everything the HTML needs is requested, including images, sounds, and plug-ins. One such element can come from Google, another from Facebook, yet another from advertisers. Each can set a separate cookie or, if its cookie is already present, read the data stored in it. The key idea here is that people who browse the Web are bombarded by cookies from those information collection agencies.

This usage of cookies is the most controversial, and has led to the polarized opinions on cookies, privacy, and the Internet.

How a cookie is transmitted to and retrieved from the computer

Understanding how cookies really work requires an understanding of how HTTP works. Cookies travel from server to client and back as HTTP headers. The specifications for these headers are explicitly laid out in RFC 2109.

When a cookie is sent from the server to the browser, an additional line is added to the HTTP headers (example):

Content-type: text/html
Set-Cookie: foo=bar; path=/; expires=Mon, 09-Dec-2002 13:46:00 GMT

This header entry would result in the creation of a cookie named foo. The value of foo is bar. In addition, this cookie has a path of /, meaning that it is valid for the entire site, and it has an expiration date of Dec 9, 2002 at 1:46pm Greenwich Mean Time (or Universal Time). Provided the browser can understand this header, the cookie will be set.

When a cookie is sent from the browser to the server, the cookie header is changed slightly:

Content-type: text/html
Cookie: foo=bar

Here, the server is made aware of a cookie called foo, whose value is bar.
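The exchange above can be reproduced with Python's standard http.cookies module. This is only a sketch: the browser side is simulated by hand, and the variable names are made up for illustration.

```python
from http.cookies import SimpleCookie

# Server side: build the Set-Cookie header from the example above.
response_cookie = SimpleCookie()
response_cookie["foo"] = "bar"
response_cookie["foo"]["path"] = "/"
response_cookie["foo"]["expires"] = "Mon, 09-Dec-2002 13:46:00 GMT"
set_cookie_header = response_cookie.output()  # a single "Set-Cookie: ..." line
print(set_cookie_header)

# Browser side (simulated): only the name=value pair travels back,
# not the path or the expiration date.
request_cookie = SimpleCookie()
request_cookie.load("foo=bar")
print(request_cookie["foo"].value)
```

Note that the attributes (path, expires) are instructions to the browser only; the server never gets them back in the Cookie: header.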

Structure of the cookie

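Cookies are set using a Set-Cookie: HTTP header with 5 possible fields separated by a semicolon and a space. These fields are:

  1. name=value: the name of the cookie and the data stored in it. This is the only required field.
  2. expires=date: the date after which the cookie is no longer valid. Without it, the cookie lasts only for the current browser session.
  3. path=path: the URL path within the site for which the cookie is valid.
  4. domain=domain_name: the domain for which the cookie is valid.
  5. secure: a flag telling the browser to send the cookie only over a secure (HTTPS) connection.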

 

How cookies are stored on my computer

After a cookie is transmitted through an HTTP header, it is stored in the memory of your browser. This way the information is quickly and readily available without re-transmission. As we have seen, however, it is possible for the lifetime of a cookie to greatly exceed the amount of time the browser will be open.

In such cases, the browser must have a way of saving the cookie when you are not browsing, or when your computer is shut off. The only way the browser can do this is to move the cookies from memory into the hard drive. This way, when you start your browser a few days later, you still have the cookies you had previously.

The browser is constantly performing maintenance on its cookies. Every time you open your browser, your cookies are read in from disk, and every time you close your browser, your cookies are re-saved to disk. If a cookie expires, it is discarded from memory and it is no longer saved to the hard drive.

How to decode the content of the cookie

The layout of Netscape's cookies.txt file is such that each line contains one name-value pair. An example cookies.txt file may have an entry that looks like this:

.netscape.com     TRUE   /  FALSE  946684799   NETSCAPE_ID  100103

Each line represents a single piece of stored information. A tab is inserted between each of the fields.

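From left to right, here is what each field represents:

  1. domain: the domain that created the variable and can read it.
  2. flag: a TRUE/FALSE value indicating whether all machines within the given domain can access the variable.
  3. path: the path within the domain for which the variable is valid.
  4. secure: a TRUE/FALSE value indicating whether a secure connection is needed to access the variable.
  5. expiration: the UNIX time at which the variable expires (seconds since Jan 1, 1970 00:00:00 GMT).
  6. name: the name of the variable.
  7. value: the value of the variable.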
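A minimal sketch of decoding such a line in Python follows. The field names in CookieEntry are illustrative labels for the conventional cookies.txt column order (domain, subdomain flag, path, secure flag, expiration, name, value), not names defined by Netscape, and the function name is hypothetical.

```python
from datetime import datetime, timezone
from typing import NamedTuple

class CookieEntry(NamedTuple):
    domain: str
    include_subdomains: bool
    path: str
    secure: bool
    expires: int  # seconds since 1970-01-01 00:00:00 UTC
    name: str
    value: str

def parse_cookies_txt_line(line: str) -> CookieEntry:
    # The real file is tab-separated; splitting on any whitespace also
    # accepts the space-padded sample shown above. The value keeps the rest.
    fields = line.strip().split(None, 6)
    if len(fields) != 7:
        raise ValueError("expected 7 fields, got %d" % len(fields))
    domain, flag, path, secure, expires, name, value = fields
    return CookieEntry(domain, flag == "TRUE", path, secure == "TRUE",
                       int(expires), name, value)

entry = parse_cookies_txt_line(
    ".netscape.com     TRUE   /  FALSE  946684799   NETSCAPE_ID  100103")
expiry = datetime.fromtimestamp(entry.expires, tz=timezone.utc)
print(entry.name, entry.value, expiry.isoformat())
```

For the sample entry the expiration works out to the last second of 1999 (UTC).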

Where does IE keep its cookies?

Microsoft IE keeps its cookies in different locations, depending on the version of IE and the version of Windows you are using. For Windows 7 this is C:\Users\user_name\Cookies.

The best way to find it is to use the Windows "Search" feature and look for the "Cookies" folder.

Although the location may be different, the format is the same. IE 9 generates random filenames for cookies:

-rwx------+ 1 Administrators Domain Users   1457 Jul 26 10:20 86LQA4ZG.txt*
-rwx------+ 1 Administrators Domain Users     87 Jun 26 09:30 8C3QOADI.txt*
-rwx------+ 1 Administrators Domain Users    349 Jul 31 10:13 GWPL3J3E.txt*
-rwx------+ 1 Administrators Domain Users 278528 Aug 17 16:56 index.dat*
drwx------+ 1 Administrators Domain Users      0 May 18 15:17 Low/
-rwx------+ 1 Administrators Domain Users    504 Jun 26 10:25 MK9XW9P5.txt*
-rwx------+ 1 Administrators Domain Users    153 Jul 17 09:34 TYH4YP57.txt*

The first file from this list holds the cookies from Yahoo:

B
5eflsa57u6r3e&b=4&d=SSXi2adpYEJwYa3rEw_N8XlGMocFAHFubsJ2Nw--&s=gn&i=bIonIkUiMMsGYBVbXAmp
yahoo.com/
1024
4158268288
30386597
3691956040
30239545
*
F
a=x1H12o8MvTaqze5t2GLO2bJNu65JoQQPI99ZZUbGRUWllK_YsqQn63RihEXzLzgyHmtZdyY-&b=dy7n
yahoo.com/
1024
4158268288
30386597
3691966040
30239545
*
YLS
v=1&p=1&n=9
yahoo.com/
1088
546742272
32055918
2426129003
30232546
*
Y
v=1&n=18q7ejeuth5l9&l=d8a14p/o&p=m1i07kk0538a1000&jb=16|31|12&r=5f&lg=en-US&intl=us
yahoo.com/
1024
4158268288
30386597
3691976040
30239545
*
PH
l=en-US&i=us&fn=xoRFzyg0OLapNeq7sd6W9g--
yahoo.com/
1024
4158268288
30386597
3691986040
30239545
*
T
z=DJVEQBDd8IQBDyuybNh.Mz4NjYzMgY1Mk9OMjFOTzYzTzc2MD&a=YAE&sk=DAAu06tfxIk5K6&ks=EAAcO025KVrpYK03OlieHjs_g--~E&d=c2wBTVRFME5RRXlOVGc1TlRZNU9ERTBPREF4TnpJMk1nLS0BYQFZQUUBZwFaWlRUQVZZUlE3UUZYRVlOREVET1dBRTY1SQFvawFaVzAtAXRpcAFRMlJnVUEBenoBREpWRVFCQTdFAXNjaWQBUGtYVzBuSWN2bkwwTWtLdFJXSVZhSUhWRVBzLQ--
yahoo.com/
1024
4158268288
30386597
3691996040
30239545
*
C
mg=1
yahoo.com/
1024
3144372992
30379598
2555836422
30232546
*
SSL
v=1&s=M80QHYHEml2iQQJbY8ule2sodCUbK3tWbAe29PlS7wqCCRD.vv6E6x5OnE8Nkq.igu7uvgdpO5M1Zhs45MFZaQ--&kv=0
yahoo.com/
9217
4158268288
30386597
3692006040
30239545
*
ucs
hs=5
yahoo.com/
1088
3575351296
31155374
2532027668
30237538
*
MSC
t=1342450330X
yahoo.com/
1024
209766656
30310964
2485122978
30237538
*
CH
AgBQBCoQADXvEAAn9hAAPt8QACcjEAAJ+BAAEQwQABwMEAACmhAABWQQADkd
yahoo.com/
1024
2350811392
30243573
2485142980
30237538
*
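The dump above can be decoded with a short Python sketch. The meaning of the lines is an assumption here, since the text does not spell it out: each record is commonly described as name, value, host/path, flags, then the low and high 32-bit halves of the expiration time and of the creation time (Windows FILETIME, counted in 100-nanosecond ticks since 1601-01-01 UTC), terminated by a line containing "*". Function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

def filetime_to_datetime(low, high):
    # Join the two 32-bit halves, then convert 100 ns ticks to microseconds.
    ticks = (int(high) << 32) | int(low)
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)

def parse_ie_cookie_file(text):
    records, current = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if line == "*":  # end-of-record marker
            if len(current) == 8:
                name, value, host_path, flags, exp_lo, exp_hi, new_lo, new_hi = current
                records.append({
                    "name": name, "value": value, "host_path": host_path,
                    "flags": int(flags),
                    "expires": filetime_to_datetime(exp_lo, exp_hi),
                    "created": filetime_to_datetime(new_lo, new_hi),
                })
            current = []
        elif line:
            current.append(line)
    return records

# One record copied from the listing above.
SAMPLE = """C
mg=1
yahoo.com/
1024
3144372992
30379598
2555836422
30232546
*
"""
records = parse_ie_cookie_file(SAMPLE)
for r in records:
    print(r["name"], r["value"], r["host_path"],
          r["created"].date(), r["expires"].date())
```

Under that assumed layout, the sample record decodes to a cookie created in 2012 that expires two years later, which agrees with the file dates in the directory listing. The sketch skips blank lines, so a cookie with an empty value would not be parsed correctly.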

Creating a Cookie Value

Creating a cookie generally involves duplicating the HTTP cookie header in some fashion so that the browser will store the name-value pair in memory. Some languages expect an exact HTTP header to be sent, while others will use built-in functions to help you speed the process along.

Cookies can be set from the browser-side or from the server-side. The determining factor will be the language you use to create the cookie. Once the cookie is created, it should flow easily from server to client and back via the HTTP headers.

There are limits on the contents of both the cookie string and the cookie file. These limits are partially imposed by HTTP and partially by arbitrary choice. They are as follows:

  1. You CANNOT set Cookies for domains other than those that your response originates from. That is, a page on www.myserver.com can set a Cookie for myserver.com and www.myserver.com, but NOT www.yourserver.com.
  2. The cookie HTTP header must be no more than 4K in size.

Note that this applies to cookies while they are in memory or stored in the cookies.txt file.
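Both limits can be expressed as simple checks. The sketch below is a simplification under stated assumptions: the function names are hypothetical, and the domain test is a plain suffix comparison, whereas real browsers apply extra rules (for example, refusing cookies for a whole top-level domain such as "com").

```python
MAX_COOKIE_BYTES = 4096  # the 4K limit mentioned above

def can_set_cookie_for(response_host: str, cookie_domain: str) -> bool:
    """True if a response from response_host may set a cookie for cookie_domain."""
    host = response_host.lower()
    domain = cookie_domain.lower().lstrip(".")
    return host == domain or host.endswith("." + domain)

def fits_size_limit(name: str, value: str) -> bool:
    """True if the name=value pair stays within the 4K limit."""
    return len(f"{name}={value}".encode("utf-8")) <= MAX_COOKIE_BYTES

print(can_set_cookie_for("www.myserver.com", "myserver.com"))        # True
print(can_set_cookie_for("www.myserver.com", "www.yourserver.com"))  # False
print(fits_size_limit("foo", "x" * 5000))                            # False
```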

Retrieving a Cookie Value

For the most part, retrieving cookies does not require reading the HTTP Cookie: header. Most languages read this header for you and make it accessible through a variable or object.

Cookies can be read on the browser side or the server side. Again, the determining factor is the language used.

The main limit on retrieving a cookie is that you can only retrieve cookies that are valid for the document your script resides in. That is, a script on www.myserver.com cannot read cookies from www.yourserver.com. This is also true for subdirectories within your site. A cookie valid for /dirOne cannot be read by a script in /dirTwo. This is mainly governed on the browser side, as browsers know the URL that they are accessing, and only transmit cookies for that server across the connection.
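The browser-side decision described above can be sketched roughly as follows. This is an illustration only, with a hypothetical function name; it ignores expiration and the secure flag and treats the cookie path as a prefix that must end on a "/" boundary.

```python
def cookie_is_sent(cookie_domain: str, cookie_path: str,
                   request_host: str, request_path: str) -> bool:
    """Decide whether a stored cookie accompanies a request."""
    domain = cookie_domain.lower().lstrip(".")
    host = request_host.lower()
    domain_ok = host == domain or host.endswith("." + domain)
    # "/dirOne" matches "/dirOne" and "/dirOne/page.html", but not "/dirTwo".
    prefix = cookie_path.rstrip("/") + "/"
    path_ok = request_path == cookie_path or request_path.startswith(prefix)
    return domain_ok and path_ok

print(cookie_is_sent("www.myserver.com", "/dirOne",
                     "www.myserver.com", "/dirOne/page.html"))  # True
print(cookie_is_sent("www.myserver.com", "/dirOne",
                     "www.myserver.com", "/dirTwo/page.html"))  # False
print(cookie_is_sent("www.myserver.com", "/",
                     "www.yourserver.com", "/"))                # False
```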

Clearing a Cookie Value

When programming a Web site, there are many reasons that you may need to erase a cookie you have created. Often it is because the cookie is no longer needed, or the scheme of your cookie has been altered, and requires resetting.

The two main steps to clearing a cookie you have created are:

  1. Set the cookie's value to null.
  2. Set the cookie's expiration date to some time in the past.

The reason you must do both is that simply setting the expiration to a past time will not change its value until the browser is closed. That is, all cookie names, values, expirations, etc. are resolved once the browser program has been closed. Setting the cookie to null allows you to properly test for the cookie until that resolution.
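The two steps can be combined in a single header. The sketch below builds it by hand in Python; the function name is hypothetical, and the path must be the same one the cookie was originally set with, otherwise the browser treats it as a different cookie.

```python
# A fixed date safely in the past (the start of the UNIX epoch).
PAST_DATE = "Thu, 01-Jan-1970 00:00:00 GMT"

def clear_cookie_header(name: str, path: str = "/") -> str:
    """Build a Set-Cookie header that empties the cookie and expires it."""
    # Step 1: empty value. Step 2: expiration date in the past.
    return f"Set-Cookie: {name}=; path={path}; expires={PAST_DATE}"

header = clear_cookie_header("foo")
print(header)
```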

Detecting if cookies are accepted

To properly detect on the server whether a cookie is being accepted, the cookie needs to be set on one HTTP request and read back on another. This cannot be accomplished within a single request. When using Perl or ASP, try to funnel your visitors through a common page where you can set a test cookie. Then, when the time comes to detect, check for that cookie.

If you use client-side languages to set a cookie, you can set and read on the same page. Cookies set by JavaScript or VBScript reside in the browser's memory already, so you will know if they have been accepted right away. Check by setting a test value, and then try to read that value back out of the cookie. If the value still exists, the cookie was accepted.
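The server-side, two-request approach might look like this in Python. It is a sketch with assumed names (the cookie name cookie_test and both functions are hypothetical), and it works on raw header strings rather than any particular web framework.

```python
from http.cookies import SimpleCookie
from typing import List, Optional

TEST_COOKIE = "cookie_test"  # hypothetical name for the probe cookie

def entry_page_headers() -> List[str]:
    """Request 1: the common entry page sets a test cookie."""
    return [f"Set-Cookie: {TEST_COOKIE}=1; path=/"]

def cookies_accepted(cookie_header: Optional[str]) -> bool:
    """Request 2: check whether the browser sent the test cookie back."""
    if not cookie_header:
        return False
    jar = SimpleCookie()
    jar.load(cookie_header)
    return TEST_COOKIE in jar and jar[TEST_COOKIE].value == "1"

print(entry_page_headers())
print(cookies_accepted("cookie_test=1; other=x"))  # True
print(cookies_accepted(None))                      # False
```

The second function only ever sees the Cookie: header of a later request, which is why the check cannot happen in the same request that sets the cookie.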

Compact Privacy Policies and IE6

In 1998, the W3C started drafting a proposal for a Platform for Privacy Preferences (P3P). P3P has 3 main goals (courtesy of the W3C):

Now an official specification, P3P uses an XML file to describe in as much detail as possible how a web site uses personal data during and after a user's session. This can include the intended usage of cookies to hold or refer to such information.

Alternatively, a site can create a P3P policy that refers solely to its cookie use. These Compact Privacy Policies are a focal point in Microsoft's new strategy in addressing the cookie "problem."

Users of Internet Explorer 6 can set their Privacy preferences based upon whether the target site has a Privacy Policy or not. If a site does not have a policy, its cookies may be automatically rejected by IE, and the user will see an icon on the status bar indicating a conflict with the user's privacy preferences.

P3P may have a broad impact on cookies and their future use, especially in the context of advertising and commerce. Even though compact policies are essentially straightforward to create, users still stand to regain a great deal of control over their browser's communications.


    Old News ;-)

    Spyware Heats Up the Debate Over Cookies By BOB TEDESCHI

    Aug 15, 2005 | NYT

    INTERNET users are taking back control of their computers, and online marketers and publishers are not pleased with the results. But they don't quite know what to do about their conundrum - if it is a conundrum, since they can't even agree on that.

    Until recently, Internet businesses could track their users freely, using what are known as cookies, tiny text files they embed on the user's hard drive. Now, with the proliferation of antispyware programs that can delete unwanted cookies, they often cannot tell who has been to their Web site before or what they have seen. And this erosion of control over a tool for gaining insight into consumer behavior has many of them fretting.

    "Cookies are critical from a business perspective," said Lorraine Ross, vice president for sales at USAToday.com. "They help us do things like track our profitability per unique visitor, for instance. But if you don't know how many people are coming in, you don't really have a handle on whether your profitability is improving or not."

    It isn't necessarily just corporate America that is threatened by the anticookie fervor, Ms. Ross said - the deleters stand to suffer, too. For example, cookies help a computer limit how many times a user sees annoying ads like a floating, animated message. Such "frequency caps," to use industry parlance, are common among publishers. "So cookies are a really good thing for managing the user's experience," she said.

    Last year, though, Ms. Ross said executives at the company debated how effective their frequency limits were, since a growing number of Internet users were deleting cookies and possibly seeing lots of animated ads.

    Ms. Ross said that like most established companies, USAToday.com did not use its cookies to identify its users. "But the user's paranoia is understandable, given the history," she said.

    Cookies first got a bad name in 1999, when DoubleClick announced that it would use them to identify Internet users and analyze both their offline purchasing patterns and online surfing habits for the purpose of showing them more relevant online ads. That plan died a loud, painful death after privacy advocates objected strenuously, and marketers and publishers have since taken a much more cautious approach.

    Even so, privacy advocates deplore cookies and, as software programs like Webroot Spy Sweeper and McAfee AntiSpyware have come on the market, surfers by the millions are apparently knocking the cookies out of service as fast as the programs can be installed. This spring, the online consulting firm Jupiter Research published a report saying that nearly 40 percent of Internet users surveyed regularly erased them.

    "I don't think cookies should be out there at all," said Marc Rotenberg, executive director of the Electronic Privacy Information Center, an advocacy group based in Washington, "but the good news here is that consumers are at least becoming more sophisticated about the appropriate use of cookies."

    Eric Peterson, the analyst who wrote the Jupiter report, pointed out that most of the deleted files were so-called third-party cookies placed on the computer by a company other than the one operating the site the user was visiting. Most publishers rely on outside companies like DoubleClick and Atlas to send ads to the user's computer and track the effectiveness of campaigns.

    Antispyware programs often leave in place first-party cookies, which can save users the inconvenience of having to log in to a news site each time they visit, but remove third-party cookies, the main target of users' ire. Some people say they think that total anonymity is the way to go.

    The threat to the bottom line is real. Mr. Peterson said cookies not only helped sites measure overall profitability, but were critical in measuring the effectiveness of individual advertising campaigns. Marketers, for instance, could conceivably pay a Web site to deliver ads to 100,000 people, but only reach about 60,000 because so many of them were being counted twice.

    "If you're O.K. with getting your ads to half as many people, and not really being sure how effective your campaign was, well then you can happily put your head in the sand," Mr. Peterson said. "Most people tell us they want data more accurate than that."

    But are that many people really blocking cookies? Some executives aren't so sure. "When I talk to publishers, nobody says the problem is as big as the press suggests," said Greg Stuart, chief executive of the Interactive Advertising Bureau, an industry trade group. "So our role should be to get to some factual basis."

    Mr. Stuart said his organization was planning its own research into the issue because, he said, much of the recent research "involves asking consumers about what they did, which isn't always a good indicator of their behavior."

    Another doubter is Peter Naylor, senior vice president for sales at iVillage, a network of women's sites. "I don't think the problem is real, based on what we're seeing, or more importantly not seeing," he said.

    Mr. Naylor said he had not conducted tests or surveys to determine if his company's visitors were deleting or blocking cookies, "but nothing has changed dramatically enough to raise a red flag."

    "And I've heard literally nothing about it from advertisers," he added.

    Among those companies fielding the most calls about cookie deletion are advertising technology businesses like Atlas. Young-Bean Song, the director of analytics for Atlas, said that even if the cookie deletion rates were as high as 40 percent, publishers and marketers could still rely on the data from the 60 percent remaining of a site's users to gauge the effectiveness of their advertising campaigns and other important statistics.

    Perhaps because executives cannot agree on the scope of the problem, solutions have been slow to emerge. Mr. Stuart of the Interactive Advertising Bureau said that if the issue turned out to be as big as some suspect, his organization was likely to embark on an ad campaign to convince online users that cookies were not harmful.

    Already, Internet companies are trying to accommodate Web users in practical ways. In May, WebTrends, a site that had helped online businesses analyze advertising and Web site data by using third-party cookies, began offering its clients the ability to offer first-party cookies without losing the data associated with the old third-party ones. Greg Drew, WebTrends' chief executive, said some users still blocked cookies altogether, so the solution was not completely effective. Still, he said, many clients had flocked to the service.

    In the meantime, Ms. Ross of USAToday.com said the real solution was to overcome consumer hostility toward what she regards as a legitimate business practice that makes life easier for everyone. That may be a long shot, but she is hopeful.

    "We have to think about long-term answers," she said. "We need to have users love their cookies, for the right reasons."


    Recommended Links

    Sites

    Cookie Central is dedicated to answering questions about cookies. Feel free to look around.

    There's a great article concerning cookies on Marshall Brain's "How Stuff Works". It goes even deeper than this FAQ does, especially in the realm of public opinion. Worth a look!

    The World Wide Web Consortium has an excellent FAQ to answer the majority of Internet and Web-related questions. You can read their topic: "Do 'Cookies' Pose any Security Risks?"

    In addition, there are an abundance of resources on the Internet that can help you find answers to your cookie questions. Conveniently, Yahoo has a great listing of them. I encourage you to stop by and check the list out!