Here's another massive database of phone records!
http://www.whitepages.com/1063/person
I guess they're spying on us.
Database ClueBat
Kim du Toit
May 12, 2006
9:25 AM
I don't know much about a lot of stuff, but I know a great deal about databases and how to use them, and I especially know a great deal about how to manage usage of terabytes of data. In a past life, I ran a customer database of grocery purchases (those annoying little loyalty cards that most supermarkets use to collect your data).
Just so we're all clear on this concept: the average supermarket carries about 40,000 different items (called stock-keeping units, or SKUs), and the average supermarket processes about one million transactions (sometimes called "baskets") a year. The chain I last did this for on a full-time basis had just under 300 stores, and a database of about 3 million active customers ("active" defined as anyone who shopped with us at least once over the past six months).
A lot has been written about how these programs intrude on people's privacy, and how this means that your shopping purchases can be tracked. Allow me to reassure you: almost nobody ever looks at a single customer's item purchases; there are just too many items, and too many customers.
What I did was design ways to make data management easier (it's what I still do), and I always operated on the 80:20 principle (that 20% of the people will account for 80% of the activity).
Which meant that I ignored 80% of all customers' information. I was only interested in those people who spent a lot of money with us (the 20%), because the data showed that not only did those people account for 80% of sales, they accounted for about 98% of our profits.
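To make the 80:20 split concrete, here is a minimal Python sketch of how one might pick out the top 20% of spenders and measure their share of sales. The function name, the dictionary layout, and the sample figures are all assumptions for illustration; nothing here is taken from the actual loyalty-card system described above.

```python
def top_spender_share(spend_by_customer, top_fraction=0.2):
    """Return (top_customer_ids, share_of_total_sales) for the top spenders.

    spend_by_customer: hypothetical dict mapping customer id -> total spend.
    top_fraction: fraction of customers to treat as "best" (0.2 = top 20%).
    """
    if not spend_by_customer:
        return [], 0.0
    # Rank customers from highest to lowest spend.
    ranked = sorted(spend_by_customer.items(), key=lambda kv: kv[1], reverse=True)
    # Keep at least one customer, even for very small inputs.
    cutoff = max(1, round(len(ranked) * top_fraction))
    top = ranked[:cutoff]
    total = sum(spend_by_customer.values())
    top_total = sum(amount for _, amount in top)
    share = top_total / total if total else 0.0
    return [customer_id for customer_id, _ in top], share


# Made-up example: one heavy spender among five customers.
spend = {"a": 800, "b": 50, "c": 50, "d": 50, "e": 50}
best_customers, share = top_spender_share(spend)
```

With these invented numbers, the single top customer (20% of the base) accounts for 80% of sales, and the other four can be ignored for most analyses.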
And the reason I only looked at that group was that if I could effect a change in their behavior (get them to spend a little more each week, for instance), the effect on the entire business was disproportionate to the effort involved.
More to the point, in all that time, I can count on two hands the number of times each year that I ever looked at any single customer's purchases, and even then, it was to check the data, or for a merchandising purpose. Here's an example: suppose the buyers decided that a particular item wasn't selling, and they decided to discontinue ("de-list") the item in favor of one which was selling more, or to give the slow item's shelf space to an existing best-seller. Good, sound merchandising.
However, if that item was being bought by our best customers, then I would argue for the item to be kept in stock, because if the customer didn't find it at our store, she would go and find it somewhere else and we could, potentially, lose that "best" customer to our competitor, which was our biggest nightmare.
Mostly, however, I looked at groups of customers: people who shopped regularly at the Deli department, or people who bought hot dogs but not buns, or people who visited the stores more than x times per week, or people who spent more than $400 every time they walked through the door. All that, to decide on what promotions to offer, or how to design new promotions, and so on. At the time I was doing this, by the way, I was one of about (maybe) a half-dozen people in the United States who actually knew what they were doing, and that only because not only had I been in the supermarket business for 20 years, but because I was also a data geek; the two characteristics combined were not common in the grocery business at the time, and still aren't that common even today.
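A group query such as "bought hot dogs but not buns" can be sketched in a few lines of Python. The (customer id, item) pair layout and the function name are assumptions made for the example, not a description of the real database.

```python
from collections import defaultdict


def customers_with_but_without(purchases, wanted, missing):
    """Return sorted ids of customers who bought `wanted` but never `missing`.

    purchases: hypothetical iterable of (customer_id, item) pairs.
    """
    # Collect the set of distinct items each customer has bought.
    items_by_customer = defaultdict(set)
    for customer_id, item in purchases:
        items_by_customer[customer_id].add(item)
    return sorted(
        customer_id
        for customer_id, items in items_by_customer.items()
        if wanted in items and missing not in items
    )


# Made-up purchase history for three customers.
purchases = [
    ("c1", "hot dogs"), ("c1", "buns"),
    ("c2", "hot dogs"), ("c2", "mustard"),
    ("c3", "buns"),
]
segment = customers_with_but_without(purchases, "hot dogs", "buns")
```

Only the group matters here: the result is a list of ids to target with a promotion, and no individual basket is ever inspected by a person.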
So, to sum up: I had to collect sales across a combination and permutation of 300 (stores) x 1,000,000 (transactions) x 40,000 (items) x 3,000,000 (customers) each year, just so I could look at small groups of customers' data. It was a process which was so massive, and so complex, that very few organizations could even begin to manage it, and many still can't. But I had to collect it all, so that I could look at just a little; there's no way to collect just a little data because in the beginning, you don't know what's important and what isn't.
Which brings me, finally, to the topic du jour, that of the NSA collecting call data from the phone companies.
Of course, there's been a lot of bloviating from the Terminally Stupid about how the NSA is "trolling" and "invading privacy" by collecting the phone data of millions of citizens. Well, it's not that simple, although the Evil Ones would like it to be.
For a start, the NSA is collecting only a couple of pieces of information: the originating phone number, and the destination phone number, and (probably) the date of the call, and (maybe) the call's duration (although I can't see why this would be important, but that's the nature of the beast).
The reason they've been collecting this data since 9/11 was because someone at NSA was being really, really smart: if terrorists are communicating by phone, it's possible to establish linkages between numbers, and install pattern-recognition software to collect those linkages. And the reason that this was a smart thing to do is a simple one: the phone company doesn't store this data beyond (maybe) a few years (the amount is just too massive to hold forever), and lest we forget, we're coming up on the 5th anniversary of 9/11 already.
Note that none of this requires any names, nor the content of the calls; that would be the privacy of the thing, and that's where it seems that the NSA, if they're telling the truth, has been quite circumspect.
But what this data gives the smart analyst is that when you establish that (357) 243-3006 belonged to Abdul El-Bomba, who received a call from his brother Aziz, a known member of Hezbollah in Syria, you now have the ability to focus only on all the calls Abdul made and received, to see who was calling him and whom he was calling. That would be a couple hundred calls, out of the (literally) tens of billions of records you've collected.
Here's the Big Clue for the Clueless: if you don't collect all the data, you can't narrow the search at all. And it's only once you've established that Abdul is a Bad Guy that you ascertain his number, and the numbers of his correspondents, and their names. Most of the calls will be innocent: the dry cleaners, the gas company, the liquor store, whatever.
But out of the couple hundred calls, you may find five that are to Mohamed Semmteks, and to Tariq Pilota, who are also terrorists, and whose calls you can now start investigating.
So from tens of billions to a couple hundred to five. And in these cases, it's NOW when you, as the investigator, can get a warrant for a wiretap so you can start listening to actual content, which, out of all the data mentioned so far, is the only part protected by the First Amendment.
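The narrowing described above (all records, then one number's contacts, then the few contacts already known to investigators) can be sketched in Python as follows. The (origin, destination) record layout, the function names, and the placeholder numbers are assumptions for illustration only; this is not a description of any actual NSA system.

```python
def contacts_of(call_records, number):
    """Return every number that called, or was called by, `number`.

    call_records: hypothetical iterable of (origin, destination) pairs,
    with no names and no call content.
    """
    contacts = set()
    for origin, destination in call_records:
        if origin == number:
            contacts.add(destination)
        elif destination == number:
            contacts.add(origin)
    return contacts


def flagged_contacts(call_records, number, known_numbers):
    """Narrow a number's contacts down to those already under suspicion."""
    return contacts_of(call_records, number) & set(known_numbers)


# Made-up records: the suspect's number is "555-0100".
records = [
    ("555-0100", "555-0111"),  # dry cleaners
    ("555-0122", "555-0100"),  # gas company
    ("555-0100", "555-0199"),  # a number already on the watch list
    ("555-0133", "555-0144"),  # unrelated call, never examined
]
all_contacts = contacts_of(records, "555-0100")
suspicious = flagged_contacts(records, "555-0100", {"555-0199", "555-0188"})
```

The unrelated call never enters the result, which is the point of the argument: the full set has to exist before it can be filtered, but almost none of it is ever looked at.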
That's how to do it, and more importantly, that's the only way to do it when you're starting from scratch.
As far as the vast majority of us are concerned, there's not much to worry about. Nobody at the NSA is interested in the call you made to your Mom, or even in the call you made to your mistress. Don't kid yourself: you're not that interesting.
Just as I was never interested in whether Betsy Smith bought Tide or Tidy-Cat.
But I have to tell you, I am really glad that someone at the NSA was doing their job, and began to collect the data a long time ago, because otherwise it would now be gone, and we'd be behind the curve, just as we were on 9/10/2001.
Excuse me? Accessing phone records (that is, a list of who called who when) requires a warrant. Was a warrant issued for this? No? Then it's illegal. I'll hold my ground on this one and continue screaming "illegal" because the law is on my side. 18 USC 2703 (c)
its legality is uncertain pending more specific information
(2) Paragraph (1) is applicable with respect to any electronic communication that is held or maintained on that service–
(A) on behalf of, and received by means of electronic transmission from (or created by means of computer processing of communications received by means of electronic transmission from), a subscriber or customer of such remote computing service; and
(B) solely for the purpose of providing storage or computer processing services to such subscriber or customer, if the provider is not authorized to access the contents of any such communications for purposes of providing any services other than storage or computer processing.
(c) Records concerning electronic communication service or remote computing service.--
(1) A governmental entity may require a provider of electronic communication service or remote computing service to disclose a record or other information pertaining to a subscriber to or customer of such service (not including the contents of communications) only when the governmental entity--
(A) obtains a warrant issued using the procedures described in the Federal Rules of Criminal Procedure by a court with jurisdiction over the offense under investigation or equivalent State warrant;
(B) obtains a court order for such disclosure under subsection (d) of this section;
(C) has the consent of the subscriber or customer to such disclosure; or
(D) submits a formal written request relevant to a law enforcement investigation concerning telemarketing fraud for the name, address, and place of business of a subscriber or customer of such provider, which subscriber or customer is engaged in telemarketing (as such term is defined in section 2325 of this title); or
(E) seeks information under paragraph (2). (more specific information)
Well, that government staff is made up of people who have done things that amaze me for a little bit of money. Let's call them criminalocrats. With that much information, what's to keep some of these bad apples from misusing it?
Let's call them criminalocrats.
The "electronic communications" services you cite are defined in 18 USC 2510(12)(A) as specifically excluding "any wire or oral communication."
by Go/27:
I'll hold my ground on this one and continue screaming "illegal" because the law is on my side. 18 USC 2703 (c)
That's a tacky diversion that I thought would have been beneath you. I have already stated that I think the government's program is wrong. Do you think your question will browbeat me to label the program illegal on an emotional basis as you have done?
by Go/27:
What's your phone number?
???? One (or a few) bad apple equals they're all a bunch of crooks and should not be trusted ????
How about one kook with a gun equals all gun owners are kooks and should be outlawed?
Be careful with how wide a stripe you paint with that brush.
Dean
So what about those files J Edgar maintained on people like Martin Luther King and those tapes that were passed around? The files he maintained on others?
Let's not, because you're presuming widespread guilt. I'm not happy about this either, and want it investigated, but starting from a position whereby you assume that the government is populated heavily with criminal types isn't going to strengthen your case or mine. That seems to be your implication.
Do you think your question will browbeat me to label the program illegal on an emotional basis as you have done?
I have already done so twice before.
No, I expect to force you to acknowledge that a person's phone number and calling habits are indeed *private* information.
I have already done so twice before.
Chapter 121 deals with "electronic communication" (not "oral communications" as you suggest). The definitions for Chapter 121 (which are in Chapter 119) specifically exclude "any wire or oral communication." [18 USC 2510(12)(A)]
Every possible access by the government of oral communications or transaction records *specifically* calls for a warrant. No matter how you slice it, they have no authorization to do this.
"wire communication" means any aural transfer made in whole or in part through the use of facilities for the transmission of communications by the aid of wire, cable, or other like connection between the point of origin and the point of reception (including the use of such connection in a switching station) furnished or operated by any person engaged in providing or operating such facilities for the transmission of interstate or foreign communications or communications affecting interstate or foreign commerce;