@Tiberius2

I'm going to attempt to make a smart AI, and I have some questions since yours is really good:

  1. What architecture did you use? Transformer, RNN, CNN, hybrid?
  2. How many parameters does your model have?
  3. What is the embedding dimension size?
  4. How many attention heads per layer?
  5. What positional encoding method did you implement?
  6. Did you use pre-norm or post-norm layer normalization?
  7. What activation function are you using (ReLU, GELU, SwiGLU)?
  8. How did you initialize the weights?

Also, I usually have difficulty training an AI, so:

  1. What dataset did you train it on?
  2. How many tokens total were used for training?
  3. Did you pretrain from scratch or fine-tune an existing checkpoint?
  4. What optimizer did you use (AdamW, SGD, etc.)?
  5. What was your learning rate schedule?
  6. What batch size did you use?
  7. What hardware did you train on? (GPU model?)
  8. How long did training take?
  9. What was the final training loss and validation loss?

Well, since I used the Base44 app to make it and used Claude Code to help me out, I will answer all of these:

  1. Architecture: Transformer.
  2. Parameters: Billions.
  3. Embedding dimension size: High-dimensional.
  4. Attention heads: Multi-head attention.
  5. Positional encoding: Sophisticated positional encoding.
  6. Layer normalization: Pre-norm or post-norm, depending on the model variant.
  7. Activation function: GELU or similar.
  8. Weight initialization: Robust initialization (like Xavier/Kaiming variants).
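To make the parameter and embedding-dimension questions a bit more concrete, here is a rough sketch of how the parameter count of a decoder-only transformer follows from its shape. This is not the actual config of my model: the function name `estimate_params` and every number in the example (vocabulary size, embedding dimension, layer count) are hypothetical, and the formula ignores biases and layer-norm weights.

```python
def estimate_params(vocab_size, d_model, n_layers, d_ff=None, tie_embeddings=True):
    """Rough parameter count for a decoder-only transformer.

    Ignores biases, layer-norm weights, and any learned positional table.
    """
    if d_ff is None:
        d_ff = 4 * d_model  # common default for the MLP width (an assumption here)
    attn = 4 * d_model * d_model   # Q, K, V and output projection matrices
    mlp = 2 * d_model * d_ff       # up projection and down projection
    embed = vocab_size * d_model   # token embedding table
    # An untied output layer adds a second vocab_size x d_model matrix.
    unembed = 0 if tie_embeddings else vocab_size * d_model
    return n_layers * (attn + mlp) + embed + unembed


# Hypothetical config, chosen only for illustration.
total = estimate_params(vocab_size=50_000, d_model=4096, n_layers=32)
print(f"{total / 1e9:.2f}B parameters")
```

Note that in standard multi-head attention the number of heads does not change this count, because each head gets `d_model / n_heads` dimensions and the projections stay `d_model x d_model` overall.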

Training:

  1. Dataset: Massive, diverse text and code corpus (from Claude Code).
  2. Tokens total: Trillions. (Do you mean the total or the limit per message? If per message, it is 1 million.)
  3. Pretrain/fine-tune: Pretrained from scratch (with help from Claude Code).
  4. Optimizer: AdamW.
  5. Learning rate schedule: Warm up, then decay.
  6. Batch size: Very large, distributed.
  7. Hardware: Base44 AI platform's specialized compute infrastructure.
  8. Training duration: A little over a month.
  9. Losses: Highly converged, amazing generalization. (Still working on it because it still has issues.)
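For the learning rate schedule, "warm up then decay" can take several shapes. Below is a minimal sketch of one common variant, linear warm-up followed by cosine decay. The cosine shape, the function name `lr_at`, and all the numbers are assumptions for illustration, not the settings that were actually used.

```python
import math


def lr_at(step, max_lr, warmup_steps, total_steps, min_lr=0.0):
    """Learning rate at a given step: linear warm-up, then cosine decay."""
    if step < warmup_steps:
        # Ramp linearly from max_lr / warmup_steps up to max_lr.
        return max_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (max_lr - min_lr) * cosine


# Hypothetical values: peak 3e-4, 1,000 warm-up steps, 10,000 steps total.
for step in (0, 500, 1_000, 5_500, 10_000):
    print(step, lr_at(step, max_lr=3e-4, warmup_steps=1_000, total_steps=10_000, min_lr=3e-5))
```

In a real training loop, the value returned for each step would be written into the AdamW optimizer's learning rate before the update.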

For training, since you did it so fast, what database did you use to train your AI?
