KTO (Kahneman-Tversky Optimization) aligns models using binary signals (a like or dislike on a single output) rather than paired preferences, which makes data collection cheaper and easier while remaining competitive with preference-based methods.
It is named after Daniel Kahneman and Amos Tversky, the researchers who developed prospect theory.
It is gaining traction largely because of this data-collection efficiency.
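
To make the difference between binary signals and paired preferences concrete, here is a minimal sketch of the two data shapes. The field names (`prompt`, `chosen`, `rejected`, `completion`, `label`) and the `unpair` helper are illustrative assumptions, not the API of any particular library.

```python
from typing import TypedDict


class PairedExample(TypedDict):
    """One preference pair: two outputs for the same prompt, ranked."""
    prompt: str
    chosen: str
    rejected: str


class BinaryExample(TypedDict):
    """One output with a standalone like/dislike signal."""
    prompt: str
    completion: str
    label: bool  # True = liked, False = disliked


def unpair(pairs: list[PairedExample]) -> list[BinaryExample]:
    """Split each preference pair into two independently labeled examples."""
    out: list[BinaryExample] = []
    for p in pairs:
        out.append({"prompt": p["prompt"], "completion": p["chosen"], "label": True})
        out.append({"prompt": p["prompt"], "completion": p["rejected"], "label": False})
    return out


# Hypothetical paired data, as a preference-based method would need it.
pairs: list[PairedExample] = [
    {"prompt": "What is 2 + 2?", "chosen": "4", "rejected": "5"},
]
binary = unpair(pairs)

# Binary feedback can also arrive with no counterpart at all,
# e.g. a single thumbs-up on one response.
binary.append({"prompt": "Name a prime number.", "completion": "7", "label": True})
```

A paired dataset can always be flattened this way, but the reverse does not hold: a lone liked or disliked output has no counterpart to pair it with, so it is only usable by a method that accepts binary signals.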