Hacker News | best comments

The best approach to circumventing the nondisclosure agreement is for the affected employees to get together, write out everything they want to say about OpenAI, train an LLM on that text, and then release it.

Based on these companies' arguments that copyrighted material is not actually reproduced by these models, and that any seemingly-infringing use is the responsibility of the user of the model rather than those who produced it, anyone could freely generate an infinite number of high-truthiness OpenAI anecdotes, freshly laundered by the inference engine, that couldn't be used against the original authors without OpenAI invalidating their own legal stance with respect to their own models.


It probably would be better to switch the link from the X post to the Vox article [0].

From the article:

“““

It turns out there’s a very clear reason for [why no one who had once worked at OpenAI was talking]. I have seen the extremely restrictive off-boarding agreement that contains nondisclosure and non-disparagement provisions former OpenAI employees are subject to. It forbids them, for the rest of their lives, from criticizing their former employer. Even acknowledging that the NDA exists is a violation of it.

If a departing employee declines to sign the document, or if they violate it, they can lose all vested equity they earned during their time at the company, which is likely worth millions of dollars. One former employee, Daniel Kokotajlo, who posted that he quit OpenAI “due to losing confidence that it would behave responsibly around the time of AGI,” has confirmed publicly that he had to surrender what would have likely turned out to be a huge sum of money in order to quit without signing the document.

”””

[0]: https://www.vox.com/future-perfect/2024/5/17/24158478/openai...


That OpenAI are institutionally unethical. That such a young company can become rotten so quickly can only be due to leadership instruction or leadership failure.

Look at Sam Altman's career and tweets. He's a clown at best, and at worst he's a manipulative crook who only cares about his own enrichment and uses pro-social ideas to give himself a veneer of trustworthiness.

So part of their compensation for working is equity, and when they leave they have to sign an additional agreement in order to keep their previously earned compensation? How is this legal? Might as well tell them they have to give all their money back too.

What's the consideration for this contract?


> And that typed text is way, way cleaner than any typewriter I’ve seen.

Pedantic point: electric typewriters (which have existed since the 1960s) do type in a way that looks exactly like this.

(In fact, note that the text on the real employee ID card, shown later in the article, doesn't look any less clean! It's just set in a different, narrower font.)

The smudginess of mechanical typewriters comes from 1. them striking (and especially, releasing) at the same speed you're depressing the key, and 2. having many of the keys necessarily approach the ribbon from an angle.

The keys being swung weakly by your fingers also means that the ink ribbons used in mechanical typewriters have to be soft and squishy (so: made of cloth), and use thin inks. These properties ensure a transfer from even a low-velocity impact. But the trade-off is that cloth ink ribbons transfer only a rough outline of what's struck; and thin inks are high-bleed inks.

An electric typewriter, playing out a pre-buffered line with a crisp, predictable report, using linear actuators and a rotating-ball type-head to bang a tape ribbon loaded with high-viscosity ink onto the page, can create text indistinguishable from books/newspapers of the same period, or from modern laser-printer reproductions of the same font faces. They're essentially character-at-a-time letterpresses!

(Also, ignoring electric typewriters for a sec: inks bleed more on thin, cheap paper. But this is [a forgery of] an employee ID card — where, for durability, a nice heavyweight paper or cardstock would have been used. You're always going to get a better-looking result inking such paper.)


A lot of the brouhaha about OpenAI is silly, I think. But this is gross. Forcing employees to sign a perpetual non-disparagement agreement under threat of clawing back the large majority of their already earned compensation should not be legal. Honestly it probably isn't, but it'll take someone brave enough to sue to find out.

> For any model that will be used broadly across all of our customers, we do not build or train these models in such a way that they could learn, memorise, or be able to reproduce some part of Customer Data

This feels so full of subtle qualifiers and weasel words that it generates far more distrust than trust.

It only refers to models used "broadly across all" customers - so if it's (a) not used "broadly" or (b) only used for some subset of customers, the whole statement doesn't apply. Which actually sounds really bad because the logical implication is that data CAN leak outside those circumstances.

They need to reword this. Whoever wrote it is a liability.


In the aughts I worked at Adobe and spent time trying to archive the source code for Photoshop, Illustrator, PostScript, and other apps. Thomas Knoll's original Mac floppy disk backups were available, so I brought in my Mac Plus, with a serial cable to transfer the files to a laptop via Kermit. The first version was 0.54, dated 6 July 1988. The files on the floppies were in various ancient compressed archive formats, but most were readable. I created an archive on a special Perforce server of all the code that I found. Sadly, the earliest Illustrator backups were on a single external disk drive that had gone bad.

Everyone replying with "what's the big deal?" is showing their tech privilege. You may not have to deal with intrusive monitoring, but warehouse workers are increasingly being made to wear ankle bracelets so every movement of theirs can be monitored and stack ranked. Workers in WFH "gig" jobs are made to install always-on keyloggers and other monitoring software on their personal computers and phones (which are required for the job). Companies take photos/videos of them in their homes every few minutes throughout the day. Plenty of jobs require you to hand your social media passwords to your employer. There is an entire class of companies that specialize in all of this.

Not everyone is able to say "no" to all this and still make rent next month. I'm happy the government is finally stepping in.


It’s like $100 per board now once you add a power supply and a case. More if you also add storage. Cheapest Intel system on Amazon is $139. The whole point of the entire thing was its affordability. That was kind of lost along the way.

Accessibility is for everyone, including you, if you live long enough. And the alternative is worse. So your choice is death or you are going to use accessibility features. – Siracusa

OP comes around with some of the coolest things posted in HN recently, and all he gets is extensive criticism, when it is clear that this is an early version :/

It's time to find a lawyer. I'm not one, but there's an intersection with California SB 331, also known as “The Silenced No More Act”. While it is focused more on sexual harassment, it's not limited to that, and these contracts may run afoul of it.

https://silencednomore.org/the-silenced-no-more-act


I love accessibility features because they might be the last features developed solely with the benefit of the user in mind. So many other app/os features are designed to steal your attention or gradually nerf usefulness.

Has it really been 9 years since I started working on Ubershaders?

I'm a little surprised no better solution has come along. Vulkan didn't even exist back then (and DirectX 12 had only just released) but instead of making things better, it digs its feet even deeper into the assumption that all shaders will be known ahead of time (resulting in long "shader recompilation" dialogs on startup in many games).

I've been tempted to build my own fast shader compiler into Dolphin for many common GPU architectures. Hell, it wouldn't even be a proper compiler, more of a templated emitter as all shaders fit a pattern. Register allocation and scheduling could all be pre-calculated.

But that would be even more insane than ubershaders, as it would be one backend per gpu arch. And some drivers (like Nvidia) don't provide a way to inject pre-compiled shader binaries.

On the positive side, ubershaders do solve the problem, and modern GPU drivers do a much better job at accepting ubershaders than they did 9 years ago. Though that's primarily because (as far as I'm aware) examples of Dolphin's ubershader have made their way into every single shader compiler test suite.


Remember: if VCs believed in what they were doing they would not take a 2% annual management fee and 20% of the upside.

They’d take 40% of the upside and live on ramen noodles.

VCs make money by raising money from LPs.

They spend this money on investments which don’t look too bad if they fail, because nearly all of them fail. Looking good while losing all of your investors’ money on companies which go broke is the key VC skill.

Once in a while you get a huge hit. That’s a lottery win; there is no formula for finding that hit. Broad bets help, but that’s about it. The “VC thesis” is a fundraising tool, a pitch instrument; it makes no measurable difference to success. It’s a shtick.

Sympathy, however, for the VC: car dealership sized transactions paired with the diligence burdens of real finance. It’s a terrible job.

Once you understand that VC is one of the worst jobs in finance and they don’t believe most of their own story (it’s fundraising flimflam for their LPs), it’s a lot easier to negotiate.

1) we are a sound bet not to get you in trouble if we fail (good schools and track records)

2) we will work hard on things which your LPs and their lawyers understand, leaving evidence of a good effort on failure

3) we know how the game works and will play by the unwritten rules: keep up appearances

The kind of lunatics who actually stand to make money with a higher probability than average - the “Think Different” category - usually violate all of these rules.

1) they have no track record

2) they work on esoteric nonsense

3) they look weird in public

And they’re structurally uninvestable.

Once you get this it’s all a lot easier: the job of a VC is not to invest in winners, that’s a bonus.

The job of a VC is to look respectable while losing other people’s money at the roulette wheel, and taking a margin for doing so.

I hope that helps.


I was involved with implementing the DNF volume counting version of this with the authors. You can see my blog post about it here:

https://www.msoos.org/2023/09/pepin-our-probabilistic-approx...

And the code here: https://github.com/meelgroup/pepin

Often, 30% of the time is spent on I/O, just reading the file; that's how incredibly fast this algorithm is. Crazy stuff.

BTW, Knuth contributed to the algo. Knuth's notes: https://cs.stanford.edu/~knuth/papers/cvm-note.pdf

He actually took time off (a whole month) from TAOCP to do this. Also, he is exactly as crazy good as you'd imagine. Just mind-blowing.


I worked at deviantArt from 2009 to 2013. It was my dream job. At the time deviantArt made money a few ways.

In no particular order, because I don't know which were profitable or which represented a larger portion of revenue:

- Subscriptions (users could pay for a few extra features and to disable ads on the site)
- DeviantArt branded merch
- Prints and products with users' art printed on them
- Sponsored Contests. These promoted movies or other media properties, or software of interest to artists. Often the prizes included Wacom tablets and Adobe Photoshop licenses.

During my time there a significant problem we were dealing with was due to deviantArt's stance on adult content. Anything was allowed as long as it wasn't outright pornography. In practice that meant that nudes were allowed but sexual acts were not. This had consequences for deviantArt's revenue. It meant that we could not run ads from the "reputable" ad networks and were forced to deal with seedier outfits that often (i.e. constantly) included malware in the display ads, exposing users to all sorts of nasty stuff. One of my projects was to detect and/or prevent the malware ads, which proved challenging and, at least given the amount of resources devoted to it, not very fruitful.

It really is sad for me to see what deviantArt has devolved into. Once the original founders sold out a few years ago I really didn't hold out much hope for the site's future.


The way VCs filter out potential investments seems fairly similar to the way Ivy League schools filter out potential students. (Probably because they are made up of the same people.) It is not really about technical brilliance, or innovation, or anything that is written on their website as a core value. It's more about whether you're smart enough and can follow instructions and fit into the overarching institutional structure of school and work.

For companies that are at the point of raising venture capital, this might be what is actually needed. But it certainly seems like it filters out a lot of the more idiosyncratic, brilliant types that aren't concerned with (from their perspective, irrelevant) details, like the date on a pitch deck. It seems like a good way to get institutional operators, not rare but not-quite-conformist innovators. I can't imagine someone like Steve Jobs or Nikola Tesla passing these VC/Ivy League kinds of tests.


I run one of the largest email sending services on the internet. I have been living the “mess” of internet email for over 20 years.

Here’s the thing: despite the internet’s email system being complex and confusing and riddled with problems, it is universally adopted and interoperable. No other open communication tool can boast of email’s massive interoperability.

Any new system will face the impossible burden of winning hearts and minds while the old system continues to chug along with billions of annual R&D spending and dozens of conferences full of smart people working on solving problems.

Even if “MX2” can peacefully coexist in the DNS, why would anyone spend the millions of dollars in engineering effort to move while their teams are busy building the latest layer that’s been invented to patch over the current system?

By all means if you want to propose a new email system, show up at IETF or M3AAWG and make a bold proposal. Someone will buy you a beer while they explain why you are much better off getting into the mud pit with the rest of us and working on the next pragmatic fix to keep things rolling along.


I love this. I read so much criticism here, but noise pollution is a main issue when it comes to railroads in residential zones.

Yes, punctuality is an issue with Deutsche Bahn. No, this doesn’t fix that instantly. But as an organisation you can work on two things at the same time.

This invention is spectacular. I wish more people would work on noise pollution. It makes a huge difference.


So much of the DeviantArt story reads like Tumblr. Two platforms appealing to amateur and small artists grow to great relevance among a patchwork of subcultures. Then, they start trying to turn a profit and end up alienating the entire userbase that carried them to that point. DeviantArt is much further down that road than Tumblr is, though. It's sad to see. Both platforms were key to the WWW of my childhood.

I wish the artists well in their AI copyright legal pursuits.


This algorithm seems to resemble HyperLogLog (and all its variants), which is also cited in the research paper. Using the same insight of the estimation value of tracking whether we've hit a "run" of heads or tails, but flipping the idea on its head (heh), it leads to the simpler algorithm described, which is about discarding memorized values on the basis of runs of heads/tails.

This also works especially well (that is, efficiently) in the streaming case, allowing you to keep something resembling a "counter" for the distinct elements, albeit with an error rate.

The benefit of HyperLogLog is that it behaves similarly to a hash set in some respects -- you can add items, count the distinct ones, and, importantly, merge two HLLs together (union), all the while keeping memory fixed to mere kilobytes even for billion-item sets. In distributed data stores, this is the trick behind Elasticsearch/OpenSearch cardinality agg, as well as behind Redis/Redict with its PFADD/PFMERGE/PFCOUNT.
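
To make the add/count/merge behavior concrete, here is a minimal HyperLogLog sketch in Python. It is an illustration only: the class name `HyperLogLog`, the parameter `p`, and the choice of SHA-1 as the hash are my own assumptions, and this is not how Redis or Elasticsearch actually implement it (they use more refined encodings and bias corrections). The point is that merging is just an element-wise max over fixed-size registers.

```python
import hashlib
import math


class HyperLogLog:
    """Toy HyperLogLog with 2**p registers (hypothetical names, for illustration)."""

    def __init__(self, p=10):
        self.p = p
        self.m = 1 << p                # number of registers
        self.registers = [0] * self.m

    def add(self, item):
        # Derive a 64-bit hash from the item's string form.
        digest = hashlib.sha1(str(item).encode()).digest()
        h = int.from_bytes(digest[:8], "big")
        width = 64 - self.p
        idx = h >> width               # top p bits choose the register
        rest = h & ((1 << width) - 1)  # remaining bits feed the "run" statistic
        # Rank = position of the leftmost 1-bit in `rest` (width + 1 if all zero).
        rank = width - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def merge(self, other):
        # Union of two sketches: element-wise max of the registers.
        if self.p != other.p:
            raise ValueError("sketches must use the same p")
        merged = HyperLogLog(self.p)
        merged.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
        return merged

    def count(self):
        # Standard bias constant; this form is intended for m >= 128.
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw
```

With `p=10` the sketch holds 1024 small integers no matter how many items are added, and the expected relative error is roughly 1.04 / sqrt(1024), about 3%. Merging two sketches gives exactly the registers you would get from sketching the combined stream, which is what makes it useful across shards.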

I am not exactly sure how this CVM algorithm compares to HLL, but they got Knuth to review it, and they claim an undergrad can implement it easily, so it must be pretty good!
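
For reference, the CVM estimator really is short. Below is a sketch in Python of the algorithm as I understand it from the paper; the names `cvm_estimate`, `thresh`, and `rng` are mine, and the paper derives the buffer size from the desired accuracy, whereas here it is just a parameter you pick.

```python
import random


def cvm_estimate(stream, thresh, rng=None):
    """Estimate the number of distinct items in `stream`, storing fewer than `thresh` items."""
    rng = rng or random.Random()
    p = 1.0        # current sampling probability
    kept = set()   # the small buffer of remembered items
    for item in stream:
        # Forget any earlier decision about this item, then flip again for it.
        kept.discard(item)
        if rng.random() < p:
            kept.add(item)
        if len(kept) >= thresh:
            # Buffer full: keep each stored item with probability 1/2 and halve p.
            kept = {x for x in kept if rng.random() < 0.5}
            p /= 2
            if len(kept) >= thresh:
                # The paper treats this as a failure; it is vanishingly unlikely.
                raise RuntimeError("buffer still full after halving")
    # Each distinct item survives with probability p, so scale back up.
    return len(kept) / p
```

If the stream has fewer distinct items than `thresh`, `p` never drops below 1 and the answer is exact. Unlike HLL, this needs no hash function, but the buffer stores actual items rather than small registers, and as written there is no merge operation.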


Comcast lobbied so hard to get our local millage for fiber expansion voted down. When it passed, they suddenly decided they’re going to spend six figures’ worth of money to expand their own fiber backbone (they’re not even to the house). It’s amazing when corporations will spend more money fighting something than actually implementing said thing.

This is kinda crazy considering the tremendous effort Google has gone to over the decades to shave milliseconds off their response time. They invented a whole TCP replacement to reduce page latency, and now this?

Extra respect is due to Jan Leike, then:

https://x.com/janleike/status/1791498174659715494


(2019)

This is from Derek Lowe's wonderful "Things I won't work with" series from his long-running blog.

Even if you're not a chemistry person, it's written in an accessible and usually quite humorous way.

I recommend his 2010 post on "FOOF" Dioxygen Difluoride from the same series.

[1] https://www.science.org/content/blog-post/things-i-won-t-wor...


Animation of how HIV infects a single T-cell

https://vimeo.com/260291607


There is exactly nothing about defacing public infrastructure which elicits the tiniest spark of "admiration" or "interest" in me.

If you want to live out your "artistic ambitions" do it somewhere away from public property.

