I don’t know if thinking that training data isn’t going to be more and more poisoned by unsupervised training data from this point on counts as “in practice”
I like using it like a rubber ducky. I even have it respond almost entirely in quacks.
Note: it’s a local model running for free. Don’t pay anyone for this slop.
“When asked about buggy AI, a common refrain is ‘it is not my code,’ meaning they feel less accountable because they didn’t write it.”
That’s… That’s so fucking cool…
Main issue is drivers. One of the best places to take advantage of Rust’s memory safety is in hardware drivers, and those would be hard to share between separate kernels.
The point of that entire talk, and the complaint that Ts’o responded to, was that to continue with Rust, there needs to be some responsibility from the guys working on the underlying C bindings to not break downstream dependencies if they refactor code.
The answer from some of the kernel developers, most vocally Ts’o, was: lol no fuck you and your toy language.
Whisper’s code and model weights are released under the MIT License. See LICENSE for further details. So that definitely meets the Open Source Definition on your first link.
Model weights by themselves do not qualify as “open source” under the OSAID. Weights are not source.
Additional WER/CER metrics corresponding to the other models and datasets can be found in Appendix D.1, D.2, and D.4 of the paper, as well as the BLEU (Bilingual Evaluation Understudy) scores for translation in Appendix D.3.
This is not training data. These are testing metrics.
Edit: additionally, assuming you might have been talking about the link to the research paper: it’s not published under an OSD license. If it were, this would qualify the model.
Those aren’t open source, neither by the OSI’s Open Source Definition nor by the OSI’s Open Source AI Definition.
The important part for the latter is a published listing of all the training data. (Trainers don’t have to provide the data, but they must provide at least a way to recreate the model given the same inputs.)
Data information: Sufficiently detailed information about the data used to train the system, so that a skilled person can recreate a substantially equivalent system using the same or similar data. Data information shall be made available with licenses that comply with the Open Source Definition.
They are model-available if anything.
> Kinito Pet now playable
How the fuck is that gonna work