I've been working on understanding the unique language of Craigslist man for man personal ads. Here's one measure: a list of the words that show up frequently in ads but relatively seldom in ordinary English, as represented by television and movie scripts.
pic host cock stats suck masculine asian discreet horny oral smooth reply neg hairy im bi cum thick email hiv latin hosting std hwp seeking vers ddf gwm athletic sucking uncut lbs latino gl muscular jock porn cocksucker slim poppers nsa ur bj pnp horned bb discrete seeks etc rim versatile dd dont stocky beefy shaved cocks submissive swap emails sensual loads bod toned anon anal dom cuddle bs chubby dominant blowjob goatee shaven hookup font husky replies cant serviced stroking asians whats completion slender ft cuddling buzzed stds mod filipino moderately kink muscled oakland servicing nips unzip uc orally boyish cl yr br scruffy endowed
The word pic (or pics) shows up once every 120 words in a Craigslist ad, but only once in a million words in TV/movie scripts. Cock is about 800 times more likely to occur in a Craigslist ad than in a script. The non-sexually-explicit terms like moderately and masculine interest me most.
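
A minimal sketch of how a ranking like this could be computed, assuming each corpus is available as one plain-text string. The function names, the tokenizer, and the `floor` value used for words missing from the reference corpus are all illustrative assumptions, not necessarily the method used to build the lists here.

```python
from collections import Counter
import re


def word_freqs(text):
    """Map each word to its relative frequency in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words)
    return {w: c / total for w, c in counts.items()}


def distinctive_words(target_text, reference_text, floor=1e-6, top=10):
    """Rank words by how much more frequent they are in target than in reference.

    `floor` is an assumed minimum reference frequency, so that words absent
    from the reference corpus do not cause a division by zero.
    """
    target = word_freqs(target_text)
    reference = word_freqs(reference_text)
    ratios = {
        w: f / max(reference.get(w, 0.0), floor)
        for w, f in target.items()
    }
    return sorted(ratios, key=ratios.get, reverse=True)[:top]


# Toy corpora, made up for illustration only.
ads = "pic pic reply the the"
scripts = "the the the mother reply"
ranking = distinctive_words(ads, scripts)
```

By the same ratio, a word seen once every 120 words in ads and once per million words in scripts comes out roughly 8,300 times more frequent in the ads.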

For completeness, here are the top words that are common in TV scripts but not in Craigslist m4m ads. A lot of proper names and feminine terms.

she's uh mother um hmm whoa mrs daughter killed wedding sonny upset theresa ooh she'll luis grace they'll sweetheart julian antonio ms charity billy ray miguel kay evidence sweetie shawn mommy mama barbara elizabeth congratulations aw mother's jennifer witch skye father's gosh ian maria powers mitch witness eddie hank grandma harmony bloody everybody's
culture
  2009-04-10 20:34 Z
Following on my new fame for looking at the ages of Craigslist ads, I did a bunch of crunching and got six months of collected ad data into a database, which makes it easy to produce reports like the following:

There's a 10:1 variation in ad volume based on the hour. The quietest time is 4am, with about 15 ads an hour; the busiest is around 8pm, with about 150 ads an hour. There's a noticeable bump when people get home from work and surprisingly little variation by day of the week. Overall the distribution looks like graphs of pretty much any Internet activity, with some bias towards more use in the evenings.
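
A report like this boils down to grouping ads by hour of day. Here is a rough sketch, assuming each ad's posting time is available as an ISO 8601 string in local time. The function name and the sample timestamps are hypothetical, and the real data lives in a database rather than a Python list.

```python
from collections import Counter
from datetime import datetime


def ads_per_hour(timestamps):
    """Average number of ads posted in each hour of the day (0-23)."""
    parsed = [datetime.fromisoformat(t) for t in timestamps]
    # Divide by the number of distinct days seen so the result is "ads an hour".
    n_days = max(len({dt.date() for dt in parsed}), 1)
    counts = Counter(dt.hour for dt in parsed)
    return {hour: counts.get(hour, 0) / n_days for hour in range(24)}


# Made-up timestamps for illustration only.
sample = [
    "2009-03-01T20:05:00",
    "2009-03-01T20:40:00",
    "2009-03-02T20:10:00",
    "2009-03-02T04:00:00",
]
hourly = ads_per_hour(sample)
```

Dividing the busiest hour's value by the quietest hour's gives the kind of ratio quoted above.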

What I really want to get at is the content of the ads, and to classify them by desired partner, desired scene, drugs, desperation, etc. I need to chat with someone who understands text clustering.

culture
  2009-03-06 23:34 Z
I'm fascinated by the Craigslist man for man personal ads (NSFW). I'm not in the market myself, but the discourse is so efficient that I enjoy reading it in idle moments; it's a great capsule of gay casual sex culture. Most of the ads are quite terse: the poster's age, a self-description, a description of the man they're looking for, preferred sexual acts, and a proposed location.

For the past seven months I've been archiving the RSS for the Bay Area boards, a collection of 485,000 unique personal ads. Here's the distribution of ages of posters.

The data isn't terribly surprising. I suspect it's a mix of Craigslist's demographics and the relation between sexual desire and sexual access as men age. Note that relatively few people are 31, 41, or 51 on Craigslist.
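
One way to put a number on the dip at 31, 41, and 51 is to compare the count at an age with the average of its two neighbors. This is only a sketch, assuming the ages have already been extracted into a list of integers. `dip_ratio` is a hypothetical helper, and the sample ages are made up.

```python
from collections import Counter


def dip_ratio(ages, age):
    """Count at `age` divided by the mean count at age - 1 and age + 1.

    A value well below 1 means the age is under-represented compared
    with its neighbors. Returns None if neither neighbor appears.
    """
    counts = Counter(ages)
    neighbors = (counts[age - 1] + counts[age + 1]) / 2
    if neighbors == 0:
        return None
    return counts[age] / neighbors


# Made-up ages for illustration only.
sample_ages = [29] * 4 + [30] * 6 + [31] * 2 + [32] * 4
ratio_31 = dip_ratio(sample_ages, 31)
```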

I'm hoping to do more analysis. I'm particularly curious about the relation of the poster's age and the age of their preferred partner, but the data is a bit fuzzy.

culture
  2009-02-23 18:44 Z