Managers
Posted by inari@piefed.zip to People Twitter@sh.itjust.works · English · 29 days ago (edited) · image: media.piefed.zip · 179 comments
Kaligalis@lemmy.world · 29 days ago
It might not be as impossible as it sounds. Some of the “open” models are rumored to be able to code. The real problem is that you likely need something with 128 GiB of VRAM to run them with a reasonably large context window.
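To make the VRAM point concrete, here is a rough back-of-the-envelope sketch of where the memory goes: the model weights plus the key/value cache, which grows linearly with the context window. The model shape used in the example (70B parameters, 80 layers, 8 KV heads, head dimension 128, 4-bit weights) is a hypothetical illustration, not a figure from this thread, and the estimate ignores activations and runtime overhead.

```python
GIB = 2**30  # bytes per GiB


def weights_gib(params_billion: float, bytes_per_param: float) -> float:
    """Memory for the model weights alone."""
    return params_billion * 1e9 * bytes_per_param / GIB


def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 context_tokens: int, bytes_per_value: int = 2) -> float:
    """Memory for the KV cache of one sequence.

    Each token stores one key and one value vector (hence the factor 2)
    per layer and per KV head.
    """
    total = 2 * n_layers * n_kv_heads * head_dim * context_tokens * bytes_per_value
    return total / GIB


# Hypothetical model shape, chosen only for illustration.
w = weights_gib(params_billion=70, bytes_per_param=0.5)   # 4-bit weights
kv = kv_cache_gib(n_layers=80, n_kv_heads=8, head_dim=128,
                  context_tokens=131072)                  # 128k-token context
print(f"weights: {w:.1f} GiB, KV cache: {kv:.1f} GiB, total: {w + kv:.1f} GiB")
```

Under these assumptions the cache for a single long-context session is larger than the quantized weights, which is why the context window, not just the parameter count, drives the hardware requirement.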
IratePirate@feddit.org · 29 days ago
An Nvidia B200 (192 GB of memory) sells for somewhere between 30k and 50k a pop. That’s feasible for a company.
Kazumara@discuss.tchncs.de · 28 days ago (edited)
And then you can serve one inference at a time. Hopefully your devs are well distributed over timezones :-)
Diurnambule@jlai.lu · 28 days ago (edited)
Wonderful idea. Maybe they can all connect to the same PC, and we can call it a mainframe or something. xD
baronofclubs@lemmy.world · 28 days ago
I don’t see why it wouldn’t be feasible to rent someone else’s computer for something like this, since renting spreads the cost out over time.
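The rent-versus-buy question comes down to a break-even calculation. A minimal sketch, assuming a purchase price at the midpoint of the 30k to 50k range quoted above and a purely hypothetical rental rate and usage level; it ignores power, hosting, resale value, and depreciation, all of which shift the answer.

```python
def break_even_months(purchase_price: float,
                      rental_per_hour: float,
                      hours_per_month: float) -> float:
    """Months of renting after which buying would have been cheaper."""
    if rental_per_hour <= 0 or hours_per_month <= 0:
        raise ValueError("rental rate and usage must be positive")
    return purchase_price / (rental_per_hour * hours_per_month)


# 40k is the midpoint of the range quoted in the thread; the hourly rate
# and the monthly usage are hypothetical placeholders, not real prices.
months = break_even_months(purchase_price=40_000,
                           rental_per_hour=4.0,
                           hours_per_month=160)
print(f"break-even after about {months:.1f} months")
```

The lower the utilization, the longer renting stays the cheaper option, which is the amortization argument in the comment above.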
mindbleach@sh.itjust.works · 29 days ago
Qwen’s 27B model from April outperforms its 397B model from February. Local and small were always going to win.
Diurnambule@jlai.lu · 28 days ago
Qwen 3.6? It is unstable, though. It goes awry more often than the 3.5 of the same size.