the Law, the Universe, and Everything 


Concurring Opinions is a general-interest legal blog operated by Concurring Opinions LLC, a Pennsylvania Limited Liability Corporation.





November 16, 2005

Does Anything Really Disappear from the Internet?

posted by Daniel J. Solove

I just posted about the Wayback Machine and that got me wondering whether anything really disappears from the Internet when it is deleted. Certainly, a ton gets archived in the Wayback Machine as well as in Google cache and in RSS readers. Of course, if something appears on the Internet, somebody could see it and copy it before it gets taken down.

But I was wondering to what extent information can vanish completely from the Internet. Thus, if a blogger posts something and then deletes it a minute later, can it escape from permanent fame? Maybe some ill-fated performances might be so brief that they can sneak on and off the Internet without being caught. What about a comment to a blog post that gets zapped quickly by the blog author? Can this escape becoming part of some permanent record?

The question, put another way: Can something posted briefly on the Internet, seen and heard by hardly anyone, not snatched up by anybody, and then deleted, be gone forever? Is there an Internet equivalent to a tree falling in the forest that nobody hears?

I don't know the answer to this question, and I would like to hear from those with more technical expertise.

UPDATE: People with expertise have answered, and their replies are worth checking out if you're interested in the issue.

Posted by Daniel J. Solove at November 16, 2005 12:10 AM


Comments

Sure it can vanish... if you didn't send it anywhere (e.g. via RSS), and if no one came to read it in the interim. For unpopular unlinked sites that can be a very long time. For a site like this one... well, just hope you didn't ping any sites to come and do updates, or have the bad luck to be visited by a Googlebot or other robot (not to mention a person who kept a copy). As a practical matter, how long the 'window of forgiveness' may be depends on your traffic... and luck. But sure, lots of old stuff is gone forever, and new stuff too can vanish un-archived, especially if you get to it quickly enough.

[And if your server is on Unix, when a file is deleted/changed it is much more erased than on Windows.]

Posted by: Michael Froomkin at November 16, 2005 12:38 AM


Yes, you can stay out of the Wayback Machine on archive.org and out of most search engines: you put a robot exclusion on your website. You can also stay out of the Wayback Machine alone and remain in the search engines, etc. Search on "robot exclusion" and you will find the magic incantations.

-brewster

Posted by: brewster kahle at November 16, 2005 01:05 AM
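
For readers who want to see what such an "incantation" might look like, here is a minimal sketch in Python. The robots.txt content is a made-up example, not this site's file, and it assumes the archive crawler identifies itself as `ia_archiver`; check the Internet Archive's own guidance before relying on that name. The standard library's `urllib.robotparser` is used only to show how a well-behaved robot would read the rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep the archive crawler out entirely,
# and keep every other robot out of a /drafts/ directory only.
# The "ia_archiver" agent name is an assumption, not taken from the post.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /drafts/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text the way a compliant crawler would."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# example.com stands in for a real blog address.
print(parser.can_fetch("ia_archiver", "http://example.com/2005/11/post.html"))
print(parser.can_fetch("Googlebot", "http://example.com/2005/11/post.html"))
print(parser.can_fetch("Googlebot", "http://example.com/drafts/regret.html"))
```

Note that an exclusion file is only a request: it works because the major robots choose to honor it, and it does nothing about a person who saves a copy.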


I've actually spent time trying to run things down that went away. Especially comments on blogs that have been deleted, etc.

The archive cycle, especially a while back, was not an hourly or even a daily one.

Posted by: Stephen M (Ethesis) at November 16, 2005 07:53 AM


It always surprises me that the engines seem to obey robots.txt, but they do. I don't know why, unless they think it will insulate them from liability.

I have successfully destroyed files from the early days of the internet, but today...

Posted by: Paul Gowder at November 16, 2005 10:00 AM


I used to blog back in college before the word was invented. Had to hard-code the HTML myself. That site is mostly dust in the wind. I've only ever been able to find the splash page archived anywhere, imploring surfers to look on my works and despair.

Could a site be up for more than a year and be scoured from the net anymore? I doubt it.

Posted by: John Armstrong at November 16, 2005 11:16 AM


Google Concurring Opinions. You'll notice a date on the 4th line of the first result. In green font it reads "Nov 14, 2005." That's the last time Google crawled your site. Generally Google crawls popular (as measured by incoming links/page rank) and regularly updated blogs every 2 or 3 days. So you have at least a day to keep something you posted out of Google.

Now, as someone noted, if you publish an RSS feed, you have a shorter lead time. It depends on how long it takes for various RSS readers to pick up your feed. Bloglines (rough estimate) usually takes from 1 to 3 hours to pick up a feed. So if you write something you regret, but delete it quickly, you have only a short time before your Bloglines subscribers read it.

Of course, as your blog gets more popular, there is a good chance that at any given time, more than one person will be reading your blog. So, putting aside technical issues to answer your question... Concurring Opinions ain't no empty forest.

Posted by: Mike at November 16, 2005 02:45 PM
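
The rough intervals in the comment above can be turned into a back-of-the-envelope estimate. The sketch below is only an illustration under a simplifying assumption that is mine, not the commenter's: each robot visits on a fixed interval and the post goes up at a random point in that cycle, so the chance of being caught is the time online divided by the interval. The function name and the midpoint values are hypothetical.

```python
def capture_probability(minutes_online: float, poll_interval_minutes: float) -> float:
    """Chance that a fixed-interval poller visits while the post is up.

    Assumes the post appears at a uniformly random point in the poll cycle.
    """
    if poll_interval_minutes <= 0:
        raise ValueError("poll interval must be positive")
    return min(1.0, max(0.0, minutes_online) / poll_interval_minutes)


# Midpoints of the ranges quoted in the comment (a simplification).
GOOGLE_CRAWL_MINUTES = 2.5 * 24 * 60   # "every 2 or 3 days"
BLOGLINES_POLL_MINUTES = 2 * 60        # "from 1 to 3 hours"

for minutes in (1, 10, 60):
    google = capture_probability(minutes, GOOGLE_CRAWL_MINUTES)
    bloglines = capture_probability(minutes, BLOGLINES_POLL_MINUTES)
    print(f"{minutes:>3} min online: Google {google:.2%}, Bloglines {bloglines:.2%}")
```

This says nothing about human readers, who, as the comment points out, are the bigger risk on a busy blog.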


Deleting the URL from your blog program doesn't necessarily delete it from the net.

Tip: To "delete" content from places like Bloglines' cache, replace the words on the page with new words, or a dot, or something else. But, do it quickly - before you post the number of posts (lastn="_") specified in your feed file. I hope that makes sense.

Anyway, when Bloglines makes its next pass, it'll cache the new words and the old words will be gone. In theory, this should work with Google's bot, too.

Posted by: Marie at November 16, 2005 05:09 PM
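
The tip above can be illustrated with a toy model. This is a sketch under assumptions of my own, not a description of how Bloglines actually works: the reader keeps one cached copy per entry, refreshes it whenever that entry shows up in the feed, and never discards entries that have dropped out of the feed. `ToyReaderCache`, `feed_window`, and `run` are hypothetical names.

```python
class ToyReaderCache:
    """Toy model of a feed reader's cache; not any real reader's behavior."""

    def __init__(self):
        self.cached = {}  # entry id -> last text seen for that entry

    def poll(self, feed_entries):
        # Entries present in the feed overwrite the cached copy with the same id.
        # Entries missing from the feed are left untouched (assumption).
        for entry_id, text in feed_entries:
            self.cached[entry_id] = text


def feed_window(posts, last_n):
    """Return the newest last_n posts, mimicking a lastn-style feed limit."""
    return posts[-last_n:]


def run(posts_after_edit, last_n=2):
    reader = ToyReaderCache()
    # First poll happens while the regretted post is still live.
    reader.poll(feed_window([("p1", "hello"), ("p2", "regret")], last_n))
    # Second poll happens after the author has changed the blog.
    reader.poll(feed_window(posts_after_edit, last_n))
    return reader.cached.get("p2")


deleted = run([("p1", "hello")])
overwritten = run([("p1", "hello"), ("p2", ".")])
too_late = run([("p1", "hello"), ("p2", "."), ("p3", "new"), ("p4", "newer")])

print(deleted, overwritten, too_late)
```

In this model, deleting the post leaves the old words in the cache, overwriting it replaces them, and overwriting after the post has scrolled out of the feed window changes nothing, which matches the advice to act quickly.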


