Questions tagged [large-language-models]
For questions about large language models (LLMs), i.e. language models that are "large" both in model size and in the amount of data they are trained on.
260 questions
0
votes
0
answers
24
views
Rethinking My Deep-Research Agent Workflow — Should We Move Beyond Static Trees? [closed]
I’m reevaluating a deep-research workflow I built earlier and would love some advice.
My previous design used a static tree workflow (fixed width/depth, node = search → extract → summarize → generate ...
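One direction worth sketching, purely as an illustration: replace the fixed width/depth with a budgeted recursion in which each node decides whether to spawn follow-up queries. All function names below (search, extract, summarize, propose_followups) are hypothetical stubs, not the asker's actual code.

```python
# Hypothetical sketch: a budget-bounded dynamic research tree instead of a
# static fixed-width/depth one. Stubs stand in for the real components.
def search(query):                    # stub: retrieval call
    return [f"doc about {query}"]

def extract(doc):                     # stub: content extraction
    return doc

def summarize(text):                  # stub: LLM summarization
    return f"summary: {text}"

def propose_followups(query, notes):  # stub: an LLM decides whether/where to expand
    return []

def research(query, budget):
    """Expand the tree only where the model proposes follow-ups, up to a budget."""
    if budget <= 0:
        return []
    notes = [summarize(extract(d)) for d in search(query)]
    for q in propose_followups(query, notes)[:budget]:
        notes += research(q, budget - 1)
    return notes

print(research("root question", budget=3))
```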
-1
votes
1
answer
46
views
Missing tokenizer.json file - GGUF Conversion [closed]
When converting mistralai/Mistral-Small-3.2-24B-Instruct-2506 to GGUF (via llama_cpp), I get an error saying the tokenizer.json ...
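A common first check (a sketch, not a confirmed fix): make sure the tokenizer files were actually downloaded alongside the weights before invoking the converter. The repo id comes from the question; everything else is an assumption about a standard huggingface_hub + llama.cpp workflow.

```python
# Sketch: download the full model snapshot, including tokenizer files,
# then point llama.cpp's converter at it. Patterns and paths are assumptions.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="mistralai/Mistral-Small-3.2-24B-Instruct-2506",
    allow_patterns=["*.json", "*.model", "*.safetensors"],
)
print(local_dir)
# Then, from a llama.cpp checkout (script name in recent versions):
#   python convert_hf_to_gguf.py <local_dir> --outfile mistral-small.gguf
```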
0
votes
2
answers
104
views
If LLMs like OpenAI / DeepSeek / Gemini exist, why do we still need ML or NLP libraries, now and in the future?
I’m new to AI and NLP, and I’m trying to understand how different tools fit together.
Large Language Models (LLMs) like OpenAI, DeepSeek, or Gemini can already handle many NLP tasks: text ...
3
votes
1
answer
114
views
Why do AI language models overuse em dashes compared to human writers?
I've noticed a consistent pattern in AI-generated text: frequent overuse of em dashes (—), sometimes multiple times in a single paragraph. In contrast, in common human writing—even in the sources AI ...
0
votes
1
answer
75
views
Finding the right setup for a local LLM for big contexts [closed]
I am trying to figure out what I would need for a setup to do the following task:
I have a Korean text of about 10-20 pages. I need to translate it, anonymize it, and also swap out some words with ...
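For scale, a minimal llama-cpp-python sketch with an enlarged context window; the model path, context size, and prompt are placeholders, and the hardware must actually fit n_ctx worth of KV cache.

```python
# Sketch: load a GGUF model with a large context via llama-cpp-python.
# Paths and sizes are hypothetical; adjust to the actual model and hardware.
from llama_cpp import Llama

korean_text = open("document.txt", encoding="utf-8").read()  # the 10-20 page source

llm = Llama(
    model_path="./models/your-model.gguf",  # placeholder path
    n_ctx=32768,                            # must cover input + output tokens
    n_gpu_layers=-1,                        # offload all layers if a GPU is present
)

out = llm.create_chat_completion(
    messages=[
        {"role": "system",
         "content": "Translate the Korean text to English and anonymize all personal names."},
        {"role": "user", "content": korean_text},
    ],
)
print(out["choices"][0]["message"]["content"])
```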
2
votes
2
answers
100
views
Why are LLMs for coding not more focused?
When asking llama3.3:70b about its supported natural and programming languages, it lists more than a dozen of each. As a user, I am usually asking questions in one natural language for one programming ...
0
votes
1
answer
23
views
Does Azure OpenAI or Amazon Bedrock store the data sent via API calls?
I have some client data that is filled with PII. I want to use Azure or AWS LLM models, but I am afraid they will use this data for further training or send it to some third party. Could ...
0
votes
0
answers
18
views
How can I use structured JSON data from PostgreSQL to populate LightRAG’s Neo4j graph without letting the LLM hallucinate relationships?
I’m working on a hybrid RAG (Retrieval-Augmented Generation) system that combines:
Structured data from PostgreSQL
A Neo4j graph database
LightRAG for hybrid (graph + vector) search
I want to use ...
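One way to keep the LLM out of edge creation entirely (a sketch under assumed table names and credentials, not LightRAG's own API): derive relationships from the relational schema and write them with plain Cypher, leaving the LLM to touch only free-text fields.

```python
# Sketch: build Neo4j relationships deterministically from PostgreSQL rows,
# so no LLM ever infers an edge. Schema, DSN, and credentials are hypothetical.
import psycopg2
from neo4j import GraphDatabase

pg = psycopg2.connect("dbname=app user=app")  # placeholder DSN
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

with pg.cursor() as cur, driver.session() as session:
    cur.execute("SELECT order_id, customer_id FROM orders")  # hypothetical schema
    for order_id, customer_id in cur.fetchall():
        session.run(
            "MERGE (c:Customer {id: $cid}) "
            "MERGE (o:Order {id: $oid}) "
            "MERGE (c)-[:PLACED]->(o)",
            cid=customer_id, oid=order_id,
        )
driver.close()
```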
0
votes
0
answers
17
views
Are vector distances still relevant if embeddings are created with the same model but different quantization?
I use the bge-m3 model to create embeddings and store them in postgres/pgvector.
I am curious if I can:
use F16 quantization during data creation and storage.
then use Q4_K_M quantization for user search/...
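An easy empirical sanity check (simulated numbers below, not real bge-m3 outputs): embed the same texts with both builds and compare the similarity rankings; if quantization noise is small relative to the gaps between neighbors, the rankings should barely move.

```python
# Sketch: compare cosine-similarity rankings between two embedding variants.
# The Q4 embeddings here are simulated as F16 plus small noise; replace both
# arrays with real outputs from the two quantized bge-m3 builds.
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

rng = np.random.default_rng(0)
emb_f16 = rng.normal(size=(5, 1024))
emb_q4 = emb_f16 + rng.normal(scale=0.02, size=(5, 1024))  # stand-in for quantization error

query = 0
sims_f16 = np.array([cosine(emb_f16[query], e) for e in emb_f16])
sims_q4 = np.array([cosine(emb_q4[query], e) for e in emb_q4])
print(np.argsort(-sims_f16))  # ranking under F16
print(np.argsort(-sims_q4))   # ranking under Q4_K_M (simulated)
```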
0
votes
1
answer
54
views
Why not scale down over-parametrised models with a low intrinsic dimension
I was just reading the LoRA paper, which states:
We take inspiration from Li et al. (2018a); Aghajanyan et al. (2020)
which show that the learned over-parametrized models in fact reside on
a low ...
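For context, the LoRA reparameterization itself fits in a few lines: keep the pretrained weight frozen and train only a rank-r update. The dimensions below are illustrative.

```python
# Sketch of the LoRA idea: y = W x + B (A x), with W frozen and only the
# low-rank factors A (r x d_in) and B (d_out x r) trained.
import numpy as np

d_out, d_in, r = 4096, 4096, 8
W = np.random.randn(d_out, d_in)      # frozen pretrained weight
A = np.random.randn(r, d_in) * 0.01   # trainable
B = np.zeros((d_out, r))              # trainable; zero init so the update starts at 0

def forward(x):
    return W @ x + B @ (A @ x)

y = forward(np.random.randn(d_in))
# Trainable parameters: r*(d_in + d_out) = 65,536 vs d_in*d_out ≈ 16.8M for full fine-tuning.
```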
0
votes
0
answers
74
views
Transformer FLOP accounting during forward pass with ISL / OSL
I'm trying to do some accounting of total inference FLOPs for a single request with input and output sequence lengths $ISL$ and $OSL$, respectively. In particular, I am trying to account for the extra ...
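A back-of-the-envelope sketch of the usual decomposition (an approximation that ignores small constant factors, with illustrative model numbers): roughly 2P FLOPs per token through the weights, plus attention terms that depend on context length; prefill pays a quadratic cost in $ISL$, while decode with a KV cache pays a linear cost per generated token.

```python
# Sketch: approximate forward-pass FLOPs for one request.
# ~2*P FLOPs/token through the weights; per layer, QK^T and A@V each cost
# ~2*d*n^2 FLOPs over n positions. Model constants below are illustrative.
P = 7e9            # parameters (example: a 7B model)
L, d = 32, 4096    # layers, model width
ISL, OSL = 2048, 512

prefill = 2 * P * ISL + L * 4 * d * ISL**2          # all input tokens at once
decode = 2 * P * OSL + L * sum(4 * d * (ISL + t)    # token t attends to ISL+t positions
                               for t in range(OSL))
print(f"prefill ≈ {prefill:.3e} FLOPs, decode ≈ {decode:.3e} FLOPs")
```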
0
votes
1
answer
67
views
What is the longest text that is represented in one token?
We know that LLMs process tokens and that a token is 4 characters on average [Source: OpenAI]. There are also tools like the OpenAI Tokenizer that visualize how a given text is tokenized.
In German text, there are tokens ...
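For one concrete vocabulary this is directly measurable; a sketch over tiktoken's cl100k_base (other tokenizers will give different answers):

```python
# Sketch: find the longest decodable token in a tiktoken vocabulary.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
longest = b""
for tok_id in range(enc.n_vocab):
    try:
        b = enc.decode_single_token_bytes(tok_id)
    except KeyError:          # some ids in the range are unused
        continue
    if len(b) > len(longest):
        longest = b
print(len(longest), longest)
```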
4
votes
1
answer
73
views
How to extend LLMs with structured data like a word breakdown dictionary?
For fun I like playing with words, and have collected a lot of structured data on words, such as:
syllables and syllable counts
pronunciations
etc.
One case I was thinking about at one point was ...
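A minimal pattern that often comes up for this (a sketch with a hypothetical dictionary, not a specific framework): retrieve the structured entry first and inject it into the prompt, so the model quotes the data instead of guessing.

```python
# Sketch: retrieval-augmented lookup against a structured word database.
# The database contents and helper names are hypothetical.
word_db = {
    "serendipity": {"syllables": ["ser", "en", "dip", "i", "ty"], "count": 5},
}

def build_prompt(question: str, word: str) -> str:
    entry = word_db.get(word.lower())
    context = f"Dictionary entry for '{word}': {entry}"
    return f"{context}\n\nAnswer using only the entry above.\nQuestion: {question}"

print(build_prompt("How many syllables does it have?", "serendipity"))
```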
0
votes
0
answers
17
views
Using PDDL for high-level task planning in a quadruped: is it practical, and how to architect the stack?
I’d like to use PDDL for high-level task planning. The motion layer (locomotion, footstep planning, obstacle avoidance) is handled by a low-level controller and local planners. I’m trying to determine ...
-1
votes
1
answer
61
views
How are very deep models designed?
How did we discover the architectures of state-of-the-art large language models?