We asked 4 frontier coding agents to build the same Unreal 3D city scene in SimWorld Studio. Same prompt. Different worlds 👀 Claude Code + Opus 4.7; Codex + GPT-5.5; Cursor + Composer 2.5; OpenCode + Gemini 2.5 Pro. Who wins? https://t.co/gXohsyhhkf
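"The biggest difference is actually the data." Andrew Dai, a former core scientist at Google, told us that the improvements in Gemini 2.5 and 3.0 came essentially from iterating on data-processing methods: stricter data filtering, better data quality, longer context. But in the end, he still left Google. The reason is simple: thousands of people, massive numbers of GPUs, and ultra-large-scale pretraining projects mean that nobody dares to take real risks. "They can't use very aggressive, very different new ideas." Once a company grows past a certain size, stability itself becomes the biggest obstacle to innovation.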
I built an open-source surgical AI benchmark a few months ago and never got around to publishing it. Life got in the way. Finally pushing it out today. Surg Bench tests 16 AI models on cases from "Surgical Exam Cases" by Charles Tan, a textbook published in 2025. Every model tested was trained before this book came out, so none of them could have seen this material during training. That's the whole point: pure medical reasoning, zero memorization. Important caveat, though: newer models released after late 2025 might have this book in their training data, so this benchmark has a shelf life. It's a snapshot of where things stood. This is NOT multiple choice. Every case is open-ended: the model writes a full surgical viva-style answer, and then two independent LLM graders (GPT-5 Mini and Gemini 2.5 Flash) score each response against the reference answer on accuracy, completeness, and clinical reasoning from 0 to 1. There are 290 cases, but many contain multiple numbered tasks, adding up to 1,249 sub-prompts across 14 surgical specialties. The results: Google is dominating surgical reasoning right now. Gemini 3 Flash leads everything at 0.882, and it's not even their biggest model. Gemini 3 Pro, Gemini 2.5 Pro, and GPT-5.2 are all clustered around 0.87. Four of the top five spots belong to Google. GPT-5.2 is the most reliable model at 0.867 with literally zero refusals. Every question answered. The weird stuff: GPT-5.1 refuses to answer 25% of urology cases, but when it does answer them it scores 0.92. Something in its safety filters is misfiring hard on that specialty specifically. Claude Sonnet 4.5 sits at 0.693 and tanks on plastics at 0.43. Anthropic's models are notably behind the frontier here. Gemini 2.5 Flash has the highest refusal rate at 25%, but when it actually answers it scores 0.84, which is competitive with the top models. Massive penalty from safety over-refusal. The category heatmap below tells the real story. No single model wins everywhere.
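A rough sketch of how the refusal penalty in the Surg Bench post could play out numerically. The post does not say how refusals or the two graders' scores are aggregated, so this assumes (hypothetically) that the two grader scores are averaged per case and that a refused case counts as 0. The function name `summarize` and the toy data are illustrative, not taken from the benchmark's code.

```python
from statistics import mean
from typing import Dict, List, Optional, Tuple

# One case = the two graders' 0-1 scores, or None if the model refused.
Case = Optional[Tuple[float, float]]


def summarize(cases: List[Case]) -> Dict[str, float]:
    """Summarize refusal rate, answered-only mean, and overall mean.

    Assumptions (not confirmed by the post): grader scores are averaged
    per case, and a refusal contributes a score of 0 to the overall mean.
    """
    answered = [c for c in cases if c is not None]
    refusal_rate = 1 - len(answered) / len(cases)
    # Mean score over the cases the model actually answered.
    answered_mean = mean(mean(c) for c in answered) if answered else 0.0
    # Overall mean with refusals counted as 0.
    overall = mean(mean(c) if c is not None else 0.0 for c in cases)
    return {
        "refusal_rate": refusal_rate,
        "answered_mean": answered_mean,
        "overall": overall,
    }


# Toy data shaped like the Gemini 2.5 Flash figures quoted in the post:
# 25% refusals, 0.84 on the cases that were answered.
cases: List[Case] = [(0.84, 0.84), (0.84, 0.84), (0.84, 0.84), None]
stats = summarize(cases)
print({k: round(v, 2) for k, v in stats.items()})
```

Under these assumptions, a 25% refusal rate pulls a 0.84 answered-only score down to roughly 0.75 × 0.84 ≈ 0.63 overall, which is one way to read the "massive penalty" the post describes.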
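Yesterday's #ジェミラン was 2000 m at a 3'30 target pace, and I ran 3'26 -> 3'30. On the numbers, I hit the target. It looks like I went out aggressively and hung on in the second half, but in reality I was just fading. I need to run 3'16 for MK, but can I do it... #Gemini3 #まるお製作所RC https://t.co/ppVbsMFUdN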
On Sunday I have to cover another 3000 at this pace?! At least realizing that my legs won't hold up in spikes was a good takeaway. #ジェミラン #Gemini3 https://t.co/rmmRd6rwoD
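The weekly Wednesday #Gemini3 practice session. Menu: ・2000 m run (target: around 03:20/km) ・5000 m tempo run. The 2000 m came out on target overall, but in practice I went out quite fast, faded in the second half, and ended up about even... I want to push a bit harder in the second half. #GeminiRunners #まるお製作所RC #ジェミラン https://t.co/GLfpssw9HH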
I built an app without writing a single line of code manually. Here's the 3-tool combo I used. Experimenting with a new workflow for building apps, this time using Claude + Stitch + Google AI Studio 🔥 Here's why I picked these 3 tools and how they work together: 1. Claude → I start here. I describe what kind of app I want, in this case an AI-powered object scanner app called ScanLens AI. I told Claude the concept, the features, the user flow, everything. Claude then turned all of that into a clean, structured design prompt ready to be used in the next tool. 2. Stitch (by Google) → I paste Claude's prompt into Stitch, and it automatically generates the full UI: layouts, components, color schemes, dark mode, all of it. This is how ScanLens AI got its sleek dark purple interface without me touching a single design tool manually. 3. Google AI Studio → After the design is ready, I export it directly into AI Studio. This is where it gets wild: it understands the app logic, executes the features, and I can test it live on my phone instantly. No manual deployment needed. The result? ScanLens AI, an app that can scan any object, detect products with up to 98% accuracy, pull product details, pricing, and reviews, and give AI-powered insights. All from a single photo. Still just an experiment, but honestly the output is kind of insane. AI is really built different now. Have you ever tried building something with this workflow or something similar? Drop it below. I'm still learning and would love to hear how others are doing it 🙌
🚨 Breaking: 5 free alternatives to ChatGPT. 1. Gemini 2.5 Flash 2. DeepSeek V3.2 3. Qwen3 235B 4. Claude Sonnet (free tier) 5. Llama 4 Maverick
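Tuesday was SWIM practice with the triathlon team. ▷Menu: P100m×7 2'00 paddles S25m×8 40" eyes up S25m×2 40" 50m×1 1'20 2 sets speed S50m×8 1'00. Slowly getting back into form 🏊♂️ Consistency really is key ☝️ After the swim, heat acclimation in the sauna 💦 Tomorrow's #Gemini3 practice session finishes off my tune-up 🏃 Looking forward to the new uniform 😄 #GeminiRunners https://t.co/m8ODWCUYWh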
GOOGLE AI STUDIO JUST MADE CODING LOOK SLOW. Most people will miss the real SEO opportunity here. Google AI Studio + Gemini 3.5 Flash: → Builds real Android apps from plain text → Runs inside your browser with no setup → Lets you test the app on a built-in fake phone screen → Uses Gemini 3.5 Flash, which Google says beats Gemini 3.1 Pro on coding and agent tasks. The SEO angle nobody is talking about: ✓ Build a keyword tracker from one prompt ✓ Pull keyword data from Google Sheets ✓ Organize content ideas from Google Drive ✓ Create landing page prototypes in minutes. The big lesson: AI can help you build faster. But SEO is how people actually find what you build.
BREAKING 🚨: Google AI Studio got a new feature, "Starter Apps": Video Analyser and Map Explorer. These are example projects built on top of the Gemini API that you can play with and clone on GitHub 👀 https://t.co/Hy3Xxi95Xw
Before Gemini 2.5 Pro, no Google model had surpassed GPT-4, okay? And now Gemini is starting to fall behind again...