smartgpt: major benchmark broken - 89.0% on mmlu exam's many errors
Published 10 months ago • 105K plays • Length 26:44
Similar videos
-
19:20
everything wrong with llm benchmarks (ft. mmlu)!!!
-
5:50
7 popular llm benchmarks explained [openllm leaderboard & chatbot arena]
-
6:46
facet by meta ai - fairness in computer vision evaluation benchmark
-
16:27
ultimate guide to llm benchmarks: mmlu, hellaswag, mbpp, gsm-8k, arc challenge & more!
-
27:42
gpt 4 is smarter than you think: introducing smartgpt
-
18:50
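mistral: 8x7b open-source moe beats llama 2 and approaches gpt-4! first open-source moe large model released! also the first open-source large model to reach gpt-3.5 level (kai-fu lee's large model yi-34b ranks above llama2-70)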
-
10:06
[artificial intelligence] ai chip race accelerates | intel/amd/google release new chips at the same time | gaudi 3 | versal gen 2 | axion | tpu v5p
-
13:14
smallest price gap ever between mainstream intel and amd mini pcs! only 100 yuan! beelink sei14 ultra 5 125h mini pc review! compared with 7840hs|7940hs|8845hs
-
14:25
integrate emacs with chatgpt or any llm - an intro to the gptel package
-
9:09
ml perf v0.7 results released -- nvidia breaks 16 ai performance records
-
0:30
the difference between gpt-3.5 and gpt-4 #openai #chatgpt
-
12:57
smartgpt: make chatgpt smarter
-
9:16
[iui'21] a human-grounded evaluation benchmark for local explanations of machine learning
-
1:30
baseline models and benchmark datasets explained
-
1:22:19
kicking off examining llm benchmarks with mmlu
-
13:48
ai benchmark for measuring machine learning performance
-
5:28
(part 1) smartgpt: one man's innovation outperforms openai's gpt-4
-
0:49
what is mmlu?
-
0:32
unveiling meta's dominance lama iii's superior performance in mmlu benchmark
-
4:02
bme ai-studio: evaluate and deploy algorithms for the bme688 gas sensor