Showing only posts tagged LLM. Show all posts.

Apr 17 2024 Using AI-Generated Legislative Amendments as a Delaying Technique

Canadian legislators proposed 19,600 amendments —almost certainly AI-generated—to a bill in an attempt to delay its adoption. I wrote about many different legislative delaying tactics in A Hacker’s Mind, but this is a new one. [...]

Mar 21 2024 Public AI as an Alternative to Corporate AI

Source

This mini-essay was my contribution to a round table on Power and Governance in the Age of AI. It’s nothing I haven’t said here before, but for anyone who hasn’t read my longer essays on the topic, it’s a shorter introduction. The increasingly centralized control …

Mar 19 2024 AI and the Evolution of Social Media

Source

Oh, how the mighty have fallen. A decade ago, social media was celebrated for sparking democratic uprisings in the Arab world and beyond. Now front pages are splashed with stories of social platforms’ role in misinformation, business conspiracy, malfeasance, and risks to mental health. In a 2022 survey, Americans …

Mar 16 2024 ASCII art elicits harmful responses from 5 major AI chatbots

Source

Enlarge / Some ASCII art of our favorite visual cliche for a hacker. (credit: Getty Images) Researchers have discovered a new way to hack AI assistants that uses a surprisingly old-school method: ASCII art. It turns out that chat-based large language models such as GPT-4 get so distracted trying to …

Mar 12 2024 Jailbreaking LLMs with ASCII Art

Source

Researchers have demonstrated that putting words in ASCII art can cause LLMs—GPT-3.5, GPT-4, Gemini, Claude, and Llama2—to ignore their safety instructions. Research paper. [...]

Mar 11 2024 Using LLMs to Unredact Text

Source

Initial results in using LLMs to unredact text based on the size of the individual-word redaction rectangles. This feels like something that a specialized ML system could be trained on. [...]

Mar 08 2024 A Taxonomy of Prompt Injection Attacks

Source

Researchers ran a global prompt hacking competition, and have documented the results in a paper that both gives a lot of good examples and tries to organize a taxonomy of effective prompt injection strategies. It seems as if the most common successful strategy is the “compound instruction attack,” as …

Mar 07 2024 How Public AI Can Strengthen Democracy

Source

With the world’s focus turning to misinformation, manipulation, and outright propaganda ahead of the 2024 U.S. presidential election, we know that democracy has an AI problem. But we’re learning that AI has a democracy problem, too. Both challenges must be addressed for the sake of democratic …

Mar 04 2024 LLM Prompt Injection Worm

Source

Researchers have demonstrated a worm that spreads through prompt injection. Details : In one instance, the researchers, acting as attackers, wrote an email including the adversarial text prompt, which “poisons” the database of an email assistant using retrieval-augmented generation (RAG), a way for LLMs to pull in extra data from …

Feb 29 2024 How the “Frontier” Became the Slogan of Uncontrolled AI

Source

Artificial intelligence (AI) has been billed as the next frontier of humanity: the newly available expanse whose exploration will drive the next era of growth, wealth, and human flourishing. It’s a scary metaphor. Throughout American history, the drive for expansion and the very concept of terrain up for …

Feb 23 2024 AIs Hacking Websites

Source

New research : LLM Agents can Autonomously Hack Websites Abstract: In recent years, large language models (LLMs) have become increasingly capable and can now interact with tools (i.e., call functions), read documents, and recursively call themselves. As a result, these LLMs can now function autonomously as agents. With the …

Feb 22 2024 New Image/Video Prompt Injection Attacks

Source

Simon Willison has been playing with the video processing capabilities of the new Gemini Pro 1.5 model from Google, and it’s really impressive. Which means a lot of scary new video prompt injection attacks. And remember, given the current state of technology, prompt injection attacks are impossible …

Feb 07 2024 Teaching LLMs to Be Deceptive

Source

Interesting research: “ Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training “: Abstract: Humans are capable of strategically deceptive behavior: behaving helpfully in most situations, but then behaving very differently in order to pursue alternative objectives when given the opportunity. If an AI system learned such a deceptive strategy …

Jan 26 2024 Chatbots and Human Conversation

Source

For most of history, communicating with a computer has not been like communicating with a person. In their earliest years, computers required carefully constructed instructions, delivered through punch cards; then came a command-line interface, followed by menus and options and text boxes. If you wanted results, you needed to …

Jan 24 2024 Poisoning AI Models

Source

New research into poisoning AI models : The researchers first trained the AI models using supervised learning and then used additional “safety training” methods, including more supervised learning, reinforcement learning, and adversarial training. After this, they checked if the AI still had hidden behaviors. They found that with specific prompts …

« newer articles | page 2