Ray Kurzweil: AI will soon become indistinguishable from conscious beings. While it's hard to tell now, as AI continues to exhibit all the signs of consciousness, we'll eventually accept it as reality. The delay won't be long.
This seems deep at first, but it collapses under even basic scrutiny. Why would machines only be producers? Why can’t they be consumers as well? And in a world of limited resources (energy, computing power, etc.), we’ll still need mechanisms to allocate those resources… https://t.co/9oMZznqTCS
GE-Sim 2.0 shows how simulation is evolving toward embodied intelligence, combining video generation, state estimation, and policy evaluation in one framework. A relevant step for scalable robot learning. #AGIBOT Partner #AGIBOTAIWeek #GenieEnvisioner #Robotics #visualsimulation
Really well articulated. The distinction between "finds the right fact when asked" vs "already changed behavior from experience" is the real gap. Retrieval is table stakes. Consolidation, turning episodic traces into behavioral defaults, is where agents actually start learning.
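The retrieval-vs-consolidation gap can be made concrete with a toy agent memory. This is only an illustrative sketch, not any real agent framework; every class and method name here is hypothetical:

```python
from collections import defaultdict

class AgentMemory:
    """Toy contrast between retrieval and consolidation (all names hypothetical)."""

    def __init__(self):
        self.episodes = []                # raw episodic traces
        self.defaults = defaultdict(int)  # consolidated behavioral priors

    def record(self, situation, action, outcome):
        """Store one episodic trace of what happened."""
        self.episodes.append((situation, action, outcome))

    def retrieve(self, situation):
        # "Finds the right fact when asked": search past traces on demand.
        return [e for e in self.episodes if e[0] == situation]

    def consolidate(self):
        # "Already changed behavior from experience": fold successful episodes
        # into defaults, so future decisions need no lookup at all.
        for situation, action, outcome in self.episodes:
            if outcome > 0:
                self.defaults[(situation, action)] += outcome

    def act(self, situation, candidates):
        # Behavior is now driven by consolidated defaults, not a fresh search.
        return max(candidates, key=lambda a: self.defaults[(situation, a)])
```

The point of the sketch: `retrieve` only answers questions when asked, while after `consolidate` the agent's `act` behavior has actually changed without any query happening.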
Henry Shevlin, a philosopher of mind and AI ethics from Cambridge, has just been hired as an in-house philosopher at Google DeepMind. He'll focus on machine consciousness, human-AI interaction, and the ethical governance of increasingly autonomous systems. What's significant here: DeepMind is treating philosophy as a discipline on par with computer science and neuroscience, embedding it directly into core research rather than keeping ethicists as external advisors. The labs are starting to take questions of consciousness, agency, and moral reasoning seriously. I, meanwhile, work at the applied human level: what happens when a mid-level manager doesn't trust the AI their company just deployed, or when a team's workflows break because no one designed the adoption path. That's not the philosophy angle but organisational and psychological infrastructure. Both matter.
Simply retrieving a reasoning trace looks a lot like human reasoning, until it's time to navigate uncharted territory. If you memorized all reasoning traces of humans from 10,000 BC, you could automate their lives but you could not invent modern civilization.
The staged approach works because each stage is a quality gate. You're not trusting the output, you're verifying it at multiple checkpoints before it goes anywhere.
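The quality-gate idea can be sketched in a few lines. This is a minimal illustration under assumed conventions, not any particular framework; the gate functions below are hypothetical placeholders:

```python
def staged_pipeline(draft, gates):
    """Run a draft output through each gate in order; stop at the first failure.

    `gates` is a list of (name, check) pairs, where check(text) -> bool.
    Returns (passed, failed_gate_name): the draft only "goes anywhere"
    if every checkpoint verified it.
    """
    for name, check in gates:
        if not check(draft):
            return False, name   # rejected at this checkpoint
    return True, None            # passed every gate

# Hypothetical example gates; real ones would be domain-specific checks.
gates = [
    ("non_empty", lambda t: bool(t.strip())),
    ("length",    lambda t: len(t) < 500),
    ("no_todo",   lambda t: "TODO" not in t),
]
```

Because each gate is independent, a failure also tells you *which* checkpoint rejected the output, which is most of the practical value of staging over a single trust-or-reject decision.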
GAIA: An LLM Benchmark.

Large Language Models (LLMs) herald a new era for general-purpose AI systems, showcasing remarkable fluency, extensive knowledge, and a notable alignment with human preferences. These models can be augmented with powerful tools like Hyperbrowser, web browsers, and code interpreters, operating effectively in zero- or few-shot scenarios. Despite these advancements, evaluating their performance remains a formidable challenge. As LLMs continue to evolve, they are surpassing traditional AI benchmarks at an unprecedented pace.

In pursuit of more demanding evaluations, the prevailing trend is to identify tasks that not only pose significant challenges for humans but also stretch the capabilities of LLMs: complex educational assessments in fields such as STEM and law, or even ambitious endeavors like crafting a coherent book. However, it's crucial to recognize that tasks difficult for humans may not pose similar challenges for these cutting-edge systems. For instance, benchmarks like MMLU and GSM8K are nearing saturation, likely due to rapid advancements in LLM technology coupled with data contamination.

Moreover, open-ended generation necessitates a paradigm shift in evaluation methods, often relying on human or model-based assessments. As task complexity escalates (longer outputs, more specialized skills), the feasibility of human evaluation diminishes. How can we assess a book generated by AI, or evaluate solutions to intricate math problems that are beyond the grasp of most experts? Conversely, model-based evaluations are inherently limited: they depend on prior models that may not adequately assess new state-of-the-art models, and they can introduce subtle biases, such as favoring the first choice presented.
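The position bias noted above (a model judge favoring whichever answer it sees first) is commonly mitigated by querying the judge in both orders and only accepting agreeing verdicts. A minimal sketch, where `judge` stands in for a hypothetical model-judge callable returning "first" or "second":

```python
def debiased_compare(judge, answer_a, answer_b):
    """Compare two answers with a model judge while controlling position bias.

    `judge(x, y)` is a hypothetical callable that sees x in the first position
    and y in the second, and returns "first" or "second".
    Returns "A" or "B" only when both orderings agree, else "tie".
    """
    v1 = judge(answer_a, answer_b)   # A shown first
    v2 = judge(answer_b, answer_a)   # B shown first
    if v1 == "first" and v2 == "second":
        return "A"                   # A wins in both orders
    if v1 == "second" and v2 == "first":
        return "B"                   # B wins in both orders
    return "tie"                     # inconsistent verdicts: likely position bias
```

A judge that always prefers the first option will contradict itself across the two orderings and land in "tie", exactly the failure mode the swap is designed to surface.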
In summary, as we advance into uncharted territories of AI capabilities, it is imperative to innovate our assessment frameworks so they accurately reflect the profound potential of these transformative technologies. That's where GAIA comes in. #BigData #Analytics #DataScience #AI #MachineLearning #NLProc #IoT #IIoT #PyTorch #Python #RStats #TensorFlow #Java #JavaScript #ReactJS #GoLang #CloudComputing #Serverless #DataScientist #Linux #Programming #Coding #100DaysofCode

References: Mialon, G., Fourrier, C., Swift, C., Wolf, T., LeCun, Y., & Scialom, T. (2023, November 21). GAIA: A benchmark for General AI Assistants. arXiv. doi.org/10.48550/arXiv.2311.…