

2 months ago
Looking up similar images and searching for crops are computer vision topics, not large language model (basically a text predictor) or image-generation AI topics.
Image hashing has been around for quite a while now, and there are crop-resistant image hashing libraries readily available, like this one: https://pypi.org/project/ImageHash/
It’s basically looking for defining features in images and storing those in an efficient, searchable way, probably in a traditional database. As long as the stored features are close enough or, in the case of a crop, a partial match, it’s a similar image.
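To make the "close enough" part concrete, here is a rough sketch of a simple average hash in plain Python. It is not how ImageHash is implemented, just the general idea: the function names, the tiny 4x4 grayscale grids, and the `max_distance` threshold are all made up for illustration, and a real library would first shrink an actual image down to a small grid like this.

```python
def average_hash(pixels):
    """Pack one bit per pixel into an int: 1 if the pixel is brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a, b):
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_similar(a, b, max_distance=2):
    """Treat two hashes as the same image if only a few bits differ (threshold is arbitrary)."""
    return hamming(a, b) <= max_distance


# Hypothetical, already-downscaled grayscale images (values 0-255).
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [18, 28, 208, 218],
]
# Same picture, slightly brighter everywhere.
brightened = [[p + 10 for p in row] for row in original]
# A different picture: dark top half, bright bottom half.
different = [[10] * 4, [10] * 4, [200] * 4, [200] * 4]

h_orig = average_hash(original)
print(hamming(h_orig, average_hash(brightened)))  # 0
print(hamming(h_orig, average_hash(different)))   # 8
```

The hash is just an integer, so it can sit in an ordinary database column, and "similar" becomes "small Hamming distance".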
The usual trick is hdparm, I guess? For me, with an 8-disk raidz2 pool, I found that playing a movie from it might put the disks to sleep between reads, because the longest timeout is a bit short. Because of that I’ve been using hd-idle with a 30-minute timeout for quite a few years already, which has worked quite nicely for me.
The only issue I’ve run into is that SMART data reads count as activity, so make sure that any SMART monitoring software has a long interval between reads and is configured not to wake the disks.
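To show why a short timeout bites during movie playback, here is a toy model of an idle timer in Python. It is only a sketch under my own assumptions, not the actual logic of hdparm or hd-idle: `IdleTracker`, `idle_at`, the I/O counter values, and the sample times are all invented for the example.

```python
class IdleTracker:
    """Tracks an I/O counter and reports when it has not changed for timeout_s seconds."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_count = None
        self.last_activity = None

    def update(self, io_count, now):
        """Record one sample; return True if the disk would be spun down at this point."""
        if self.last_count is None or io_count != self.last_count:
            # Counter moved (or first sample): treat as activity and restart the timer.
            self.last_count = io_count
            self.last_activity = now
            return False
        return now - self.last_activity >= self.timeout_s


def idle_at(timeout_s, samples):
    """Return the sample times at which the disk counts as idle long enough to sleep."""
    tracker = IdleTracker(timeout_s)
    return [t for t, count in samples if tracker.update(count, t)]


# Hypothetical playback: polled every 5 minutes, a burst of reads every 15 minutes.
samples = [(0, 1), (300, 1), (600, 1), (900, 2), (1200, 2), (1500, 2), (1800, 3)]

print(idle_at(600, samples))   # [600, 1500]
print(idle_at(1800, samples))  # []
```

With the 10-minute timeout the disk goes to sleep between read bursts and has to spin up again for the next one; with the 30-minute timeout it stays awake for the whole movie.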