• kromem@lemmy.world
    11 months ago

    It depends on the stage of training. As the recent Anthropic research showed, fine-tuning a behavior out of a model isn’t so easy.

    And at the pretraining stage you really can’t get halfway decent results from a limited data set, so the only place you could try to bias the model is the fine-tuning stage, using biased sourcing. But per the Anthropic findings (and the real-world cases I mentioned above), that only biases a thin veneer over the pretrained model.