Feature/cpu rope #256

Merged
merged 34 commits into from
Oct 29, 2024
Changes from 1 commit
Commits
ab223ca
initial rope setup
ivarflakstad Sep 6, 2024
2534dfb
tidy
ivarflakstad Sep 6, 2024
f7d3a32
chore: refactor cpu gemm for use in rope
ivarflakstad Sep 6, 2024
5f15257
debugging cpu RoPE
ivarflakstad Sep 6, 2024
70dc153
Rope cpu is almost there
ivarflakstad Sep 6, 2024
d1f31da
debugging gemm/rope interaction
ivarflakstad Sep 9, 2024
2366e9d
More debugging gemm/rope
ivarflakstad Sep 9, 2024
9b4bd6c
Revert gemm back to binary mul
ivarflakstad Sep 9, 2024
f998176
chore: turns out while I was confused, it was not about gemm
ivarflakstad Sep 9, 2024
a09f986
chore: Add interleave_by_offset
ivarflakstad Sep 9, 2024
46861fa
close
ivarflakstad Sep 9, 2024
23f2687
Most RoPE test cases are passing
ivarflakstad Sep 9, 2024
328d8ce
getting there
ivarflakstad Sep 18, 2024
6e39c34
Merge branch 'master' into feature/cpu-rope
ivarflakstad Sep 20, 2024
33b097e
testing a bunch of different things. really messy :)
ivarflakstad Oct 1, 2024
d5fb9f8
chore: focus on theta
FL33TW00D Oct 2, 2024
44bc1ec
chore: theta matches
FL33TW00D Oct 2, 2024
81f4bfc
chore: theta matches
FL33TW00D Oct 2, 2024
82435eb
chore: R1 and R2 match
FL33TW00D Oct 3, 2024
ca5f5a7
chore: cleaning
FL33TW00D Oct 3, 2024
88e7c07
chore: RoPE works but is shit
FL33TW00D Oct 4, 2024
4d63692
chore: RoPE doesn't work
FL33TW00D Oct 4, 2024
1d93205
chore: not quite right
FL33TW00D Oct 4, 2024
572e7d1
chore: rope concat dynamic outs length
ivarflakstad Oct 16, 2024
ce991ba
chore: simplify rope concat
ivarflakstad Oct 16, 2024
67a40c9
chore: padding r1/r2 with 0s works. Not optimal
ivarflakstad Oct 18, 2024
c932fd5
chore: use randn in rope test to avoid precision issues
ivarflakstad Oct 20, 2024
022db5a
chore: remove redundant "outs" vec
ivarflakstad Oct 20, 2024
508b5ed
chore: use iter cycle instead of % check
ivarflakstad Oct 20, 2024
52863d2
Merge branch 'master' into feature/cpu-rope
ivarflakstad Oct 28, 2024
be77442
chore: remove unused strided iterator. may be useful later
ivarflakstad Oct 28, 2024
5a018ce
chore: tidy up
ivarflakstad Oct 29, 2024
99a074e
chore: ? > unwrap
ivarflakstad Oct 29, 2024
b74e4b2
chore: Add back default debug_struct in Debug for Tensor impl
ivarflakstad Oct 29, 2024
tidy
ivarflakstad committed Sep 6, 2024
commit 2534dfbd5c5479a3cf3e42172a047927ff27f73e
30 changes: 0 additions & 30 deletions crates/ratchet-core/src/cpu/rope.rs
@@ -74,33 +74,3 @@ fn rope(src: &[f32], shape: &Shape, dim: usize, base: f32, offset: usize) -> Vec
});
dst
}

fn old_rope(src: &[f32], shape: &Shape, dim: usize, base: f32, offset: usize) -> Vec<f32> {
    let cos = src.iter().map(|x| x.cos()).collect::<Vec<f32>>();
    let sin = src.iter().map(|x| x.sin()).collect::<Vec<f32>>();

    let b = *shape.get(0).unwrap();
    let t = *shape.get(1).unwrap();
    let h = *shape.get(2).unwrap();
    let d = *shape.get(3).unwrap();

    let el_count = b * h * t * d;
    let mut dst = vec![0.0; el_count];
    src.chunks(t * h * d)
        .zip(dst.chunks_mut(t * h * d))
        .for_each(|(src, dst)| {
            for i_t in 0..t {
                for i_d in 0..d / 2 {
                    let i_cs = i_t * (d / 2) + i_d;
                    for i_h in 0..h {
                        let i1 = i_t * h * d + i_h * d + i_d;
                        let i2 = i1 + d / 2;
                        dst[i1] = src[i1] * cos[i_cs] - src[i2] * sin[i_cs];
                        dst[i2] = src[i1] * sin[i_cs] + src[i2] * cos[i_cs];
                    }
                }
            }
        });

    dst
}
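
For context on the function removed above: `old_rope` takes its `cos` and `sin` values from `src` itself and never reads `dim`, `base`, or `offset`. The sketch below shows how the same rotate-half loop might look if the angle tables were built from `base` and `offset` in the usual RoPE manner. It is an illustration only, not the `rope` function kept in this PR: the name `rope_sketch`, the plain `[usize; 4]` shape in `[b, t, h, d]` order, and the choice to rotate the full head dimension (no `dim` parameter) are all assumptions made for the example.

```rust
/// Hypothetical sketch of rotate-half RoPE over a contiguous [b, t, h, d] buffer.
/// Not ratchet's API: the name, the array-based shape, and the full-width
/// rotation (no `dim` argument) are assumptions made for illustration.
fn rope_sketch(src: &[f32], shape: [usize; 4], base: f32, offset: usize) -> Vec<f32> {
    let [b, t, h, d] = shape;
    assert_eq!(src.len(), b * t * h * d, "src length must match shape");
    assert!(d % 2 == 0, "head dimension must be even");
    let half = d / 2;

    // Angle tables: theta(i_t, i_d) = (offset + i_t) * base^(-2 * i_d / d).
    let mut cos = vec![0.0f32; t * half];
    let mut sin = vec![0.0f32; t * half];
    for i_t in 0..t {
        for i_d in 0..half {
            let inv_freq = base.powf(-2.0 * i_d as f32 / d as f32);
            let theta = (offset + i_t) as f32 * inv_freq;
            cos[i_t * half + i_d] = theta.cos();
            sin[i_t * half + i_d] = theta.sin();
        }
    }

    let mut dst = vec![0.0f32; src.len()];
    if src.is_empty() {
        // `chunks(0)` would panic, so return early for empty inputs.
        return dst;
    }

    // One chunk per batch element; the indexing mirrors the removed `old_rope`.
    for (src, dst) in src.chunks(t * h * d).zip(dst.chunks_mut(t * h * d)) {
        for i_t in 0..t {
            for i_h in 0..h {
                for i_d in 0..half {
                    let i_cs = i_t * half + i_d;
                    let i1 = i_t * h * d + i_h * d + i_d;
                    let i2 = i1 + half;
                    // 2D rotation of the pair (src[i1], src[i2]) by theta.
                    dst[i1] = src[i1] * cos[i_cs] - src[i2] * sin[i_cs];
                    dst[i2] = src[i1] * sin[i_cs] + src[i2] * cos[i_cs];
                }
            }
        }
    }
    dst
}
```

With `offset = 0`, the first position has an angle of zero, so its values pass through unchanged. This is a convenient sanity check when comparing a CPU implementation against a reference.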