swiftywizard@discuss.tchncs.de to Programmer Humor@programming.dev · 5 months ago
Lavalamp too hot (image, discuss.tchncs.de) · 75 comments
dream_weasel@sh.itjust.works · 5 months ago
This kind of stuff happens with any model you train from scratch, even before you train it for multi-step reasoning. It seems to happen more often when the training set doesn't have enough data, but it's not an intentional addition. Output length is a whole deal.