On this very thread you already have people talking about "open weights" and similar nonsense. What is open about them? They're free to download, but that hardly qualifies as open. Where is the source? Where are the instructions to modify and build your own?
I never thought I'd have to utter the expression "open as in beer".
The blatant attempt at manipulating vocabulary here is... quite blatant.
I'm a strong proponent of Open Source (TM) but I disagree with this take.
The weights are the useful artifact here. You can modify them, fine-tune them and do what you want with them.
Unlike binary software there is nothing limiting that.
It is also useful to have access to the training recipes and to some extent the data. But I'm of the opinion that learning on something is not copyright infringement, so there are many circumstances where distributing the raw training data will not be possible.
For me this is like Open Office: it is open source, and largely inspired by and learned from Microsoft Office. But they don't need to distribute MS Office for Open Office to be Open Source.
In addition there are models that meet the criteria you appear to propose. The AllenAI models are a good example.
The analogy falls apart very quickly. Without the training data, your modifications amount to virtually nothing compared to what these "versions" are, and the idea that you can maintain and improve on these models without the continual support of the company that owns the training data AND harnesses AND in general build instructions is not very credible. This is why it's not rare that they "dump" old versions as freeware but at some point switch to not distributing them, and mostly get away with it. As this is really not open, and the threat of an effective fork is therefore non-existent, the pressure for anyone who has released freeware models to "go SaaS" is too high.
While if "Open Office" switches to a more problematic license at some point, the existing source has all you need for an organization to support the project without regard to the original company (this has happened already!). If Qwen decides to stop distributing models for download, you're basically stuck, _even_ if you have unlimited resources, it's not clear how the released weights help you; your best bet is to start almost from scratch. This has also happened...
These models are not "Open" by any definition of the word. They are just freely redistributable. You can justify yourself in whatever way you want re a cowboy approach to copyright, but this doesn't change the fact that this is not open, and has almost none of the benefits of open, and therefore it is a huge abuse of the word "Open".
Ironically, about the only thing that is copyrightable here is the sum of the training data (possibly) _AND_ the software used to build the model (most definitely). The model itself most likely isn't (databases are not copyrightable), which makes it even more pointless to abuse the word "open" for it. All the value is in the former two.
> The analogy falls apart very quickly. Without the training data, your modifications amount to virtually nothing compared to what these "versions" are, and the idea that you can maintain and improve on these models without the continual support of the company that owns the training data AND harnesses AND in general build instructions is not very credible.
This is completely wrong, and sort of shows why what you are saying is not a problem at all.
You can post-train any LLM very easily without access to the original training data.
People do it all the time.
Cursor post-training Kimi K2 is a great example.
> If Qwen decides to stop distributing models for download, you're basically stuck, _even_ if you have unlimited resources, it's not clear how the released weights help you; your best bet is to start almost from scratch.
What are you talking about? You just post-train it.
There is exactly zero difference before and after they stop distributing it. People don't have access to the training data now (while the weights are being distributed) and they post-train very successfully.
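To make the mechanics concrete, here is a deliberately tiny sketch of what "post-training without the original data" means: start from weights someone else released and keep running gradient descent on your own data. The linear model, the `released` values and `new_data` are all made up for illustration. Nothing here is an LLM, and it says nothing about how far this scales.

```python
# Toy stand-in for "released weights": a linear model y = w*x + b.
# We never see whatever data produced these values (hypothetical numbers).
released = {"w": 2.0, "b": 0.0}

def predict(params, x):
    return params["w"] * x + params["b"]

def mse(params, data):
    # Mean squared error of the model on a list of (x, y) pairs.
    return sum((predict(params, x) - y) ** 2 for x, y in data) / len(data)

def post_train(params, data, lr=0.05, steps=500):
    """Continue gradient descent from the released weights, on new data only."""
    w, b = params["w"], params["b"]
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return {"w": w, "b": b}

# New task data (hypothetical): same slope as before, shifted up by 1.
new_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
tuned = post_train(released, new_data)

print("error before:", mse(released, new_data))
print("error after: ", mse(tuned, new_data))
```

The only inputs are the released parameters and the new data; the original training set never appears.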
The weights are created through training. The 'source' would be the training data, which is going to be a massive amount of data, and is not something that could just be easily shared.
Do note that even if you don't do shell expansion, you're still subject to "smart" programs interpreting a single argv entry that starts with a dash as a parameter and its argument. I'm sure there's going to be a CVE about this at some point if there hasn't been one already.
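A minimal sketch of that failure mode, using Python's standard `argparse` with no shell involved. The tool, its `-o` option and the hostile file name are hypothetical; the point is only that a single argv entry such as `-oevil.txt` gets split into an option and its value unless the caller passes `--` first or prefixes the name with `./`.

```python
import argparse

def parse(argv):
    # Hypothetical tool: one option that takes a value, plus file names.
    parser = argparse.ArgumentParser(prog="tool")
    parser.add_argument("-o", "--output")
    parser.add_argument("files", nargs="*")
    return parser.parse_args(argv)

# A file name an attacker controls (hypothetical).
hostile = "-oevil.txt"

# Passed as one argv entry, it is read as option -o with value "evil.txt".
naive = parse([hostile])

# "--" ends option parsing, so the same string is kept as a file name.
safe = parse(["--", hostile])

# Prefixing with "./" also works, since the entry no longer starts with "-".
prefixed = parse(["./" + hostile])

print(naive)
print(safe)
print(prefixed)
```

Other argument parsers have their own rules, so this shows one parser's behavior, not a general guarantee.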
And don't even care to make a serious effort to get it back. I suspect if they tried using the UDRP with a claim "we lost it by accident, cybersecurity risk, current owner is just squatting on it without actively using it" – they'd have quite decent odds of success, given the attitudes of the average UDRP arbitrator. The current holder would of course argue "you lost it more than a decade ago, you should be estopped by the passage of time" – but again, the average UDRP arbitrator would likely weigh the "cybersecurity risk" argument higher.
Espressif products are not ideal for Bluetooth audio, since support for classic Bluetooth (which is still what is mostly used for Bluetooth audio) is hit or miss, and on newer models often entirely missing.
I was not aware that _any_ Espressif hardware even supported classic Bluetooth other than the very first ESP32 (which I'm not sure is even still available). And I was getting around 50ms latency back then (with the original ESP32 and SBC!)
I’m using BLE GATT messaging with an upgrade path to L2CAP CoC channels for clients that support it. Roughly the path is: audio input -> opus encode -> BLE transmit -> smartphone/desktop. The latency floor ends up being ~80ms due to jitter buffer sizing, etc.
That statement doesn't stand on its own. For example, the most popular OS for laptops at my place of work is Windows. It has very little to do with what people want or with price. It has almost everything to do with ecosystem lock-in.
A significant portion of Windows laptop market share comes from corporate purchases.
Is this totally true? There is advertising, marketing spend and retail shelf space. Surely it's more complex than "solves users problem at price point."
Advertising and marketing spend exist to make people aware of the device's capabilities and its price. I would be surprised to find that any consumer chooses a device because of its marketing spend and retail shelf space.
You wrote “I would be surprised to find that any consumer chooses a device because of its marketing spend”. But advertising does skew consumer choice by its presentation, and the success correlates with marketing spend. It’s far from merely informational. Otherwise we’d just have black on white listings of “this product exists” with spec sheets.
Most likely, the goal here was copyright laundering.