RE: LeoThread 2024-11-17 10:12

“The key point of our work is that there are limitations you cannot naïvely get around,” Kumar concluded. “We hope our work adds nuance to the discussion that often seeks increasingly low precision defaults for training and inference.”

Kumar acknowledges that the study he and his colleagues conducted was relatively small in scale, and they plan to test it with more models in the future. But he believes that at least one insight will hold: There’s no free lunch when it comes to reducing inference costs.