Part of what makes local model engines and custom ML chips interesting is precisely that they make small, custom local models practical. Right now LLMs need so much compute and so much data to train and run that even the most expensive paid options reportedly lose money on every query.
So the reason every tutorial starts with “download this model” is that there’s a good chance you don’t have the thousands of datacenter GPUs and the terabytes to petabytes of scraped and curated text needed to train a large language model from scratch. There’s a reason there are only big players in this game.
Even if you could design your own model, how do you acquire a dataset even a fraction of the size of the ones behind those pretrained models from corps?
Then how do you train the model in a reasonable time, other than by relying on cloud computing? That leads back to the same problem I outlined before: only corps can play this game properly right now.
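To put rough numbers on the training-time question: a common back-of-the-envelope rule is that training a transformer costs about 6 × parameters × tokens floating-point operations. Here is a minimal sketch under that rule, assuming a peak of roughly 11.3 TFLOPS FP32 for a 1080 Ti and a guessed 30% utilization. The function name, model sizes, and token counts are illustrative assumptions, not figures from any specific model.

```python
SECONDS_PER_DAY = 86_400

# Assumption: approximate peak FP32 throughput of a GTX 1080 Ti, in FLOP/s.
GTX_1080_TI_FP32 = 11.3e12


def training_days(params, tokens, peak_flops, utilization=0.3, num_gpus=1):
    """Estimate wall-clock training days with the ~6 * N * D FLOPs rule of thumb.

    utilization is an assumed fraction of peak throughput actually achieved.
    """
    total_flops = 6 * params * tokens
    effective_flops_per_second = peak_flops * utilization * num_gpus
    return total_flops / effective_flops_per_second / SECONDS_PER_DAY


# Hypothetical small model: 125M parameters trained on 2.5B tokens.
small_days = training_days(125e6, 2.5e9, GTX_1080_TI_FP32)

# Hypothetical larger model: 7B parameters trained on 1T tokens.
large_days = training_days(7e9, 1e12, GTX_1080_TI_FP32)

print(f"small model: about {small_days:.1f} days on one card")
print(f"large model: about {large_days / 365:.0f} years on one card")
```

Under these assumptions the small model is on the order of days on a single card, while the larger one is on the order of centuries. The estimate ignores memory limits entirely, and a model that size would not fit on one consumer GPU in the first place.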
I designed the model and collected/labeled the data for a relatively small deep CNN for my master's thesis, and training it on 60,000 images took over a dozen hours on a 1080 Ti (this was 5 years ago at this point, so I may be misremembering the exact figure).
Facts.