• CovfefeKills@lemmy.world
    2 days ago

    I am thoroughly impressed with the quality of the local Gemma 3 model, and these models improve pretty much weekly. About the scene: the tortoise looks normal-sized, and the house on top of the tortoise also looks normal-sized. Scale is a particular challenge in this scene because those two "normal" sizes conflict. I guess the AI assumes the house is accessible to normal-sized humans, and that is why it labels the tortoise as gigantic, but for all we know the tortoise is standard-sized and mini humans inhabit the house.

    Oh, the concept comes from tortoises that hibernate in shallow ponds, accumulating dirt and pond plants on their shells. They are like majestic swimming islands, and that is where the idea of a miniworld on their shells comes from. I think Gemma 3 27b can mask 3D objects in images, so it might be the go-to API model for cost-effective vision tasks (Google removed their image demo thing so I cannot confirm, but I feel like I remember being impressed by the 27b model for vision tasks).