chromium / external / github.com / google / XNNPACK / HEAD
79a6eac
_mm256_packs_epi* is AVX2, not AVX
by Dillon Sharlet
· 9 hours ago
upstream/master
6400256
Fix windows build flags
by Dillon Sharlet
· 16 hours ago
1e2d4d0
Added convert<bf16x8, f32x8> to x86_avx2 to decrease intrinsics usage in reduce kernels.
by Misha Gutman
· 18 hours ago
31b1ede
simd test improvements
by Dillon Sharlet
· 20 hours ago
3c9ad05
Fix uses of AVX2 instructions in AVX targeted code
by Dillon Sharlet
· 20 hours ago
aabc21c
Automated Code Change
by XNNPACK Team
· 33 hours ago
269fbf1
Use a more intuitive formulation of `make_stencil_dim`
by Dillon Sharlet
· 2 days ago
be53d66
Fixed int2 fully connected setting producers in the beginning of row_sum rewrite.
by Misha Gutman
· 2 days ago
b666baa
Templated neondot int2 GEMMs.
by Misha Gutman
· 2 days ago
dfaf2ae
Added scalar kernels for int2 gemm.
by Misha Gutman
· 2 days ago
aa67973
Re-enable kernels disabled under msan due to msan bugs
by Dillon Sharlet
· 2 days ago
dc05a09
Add @bazel_tools//tools/cpp/compiler:clang-cl as a signal to target MSVC-style builds
by Dillon Sharlet
· 2 days ago
631f179
Use _mm_packs_epi32 instead of _mm_packus_epi32 for converting to uint8
by Dillon Sharlet
· 3 days ago
0852723
Give ternary nodes a proper type
by Dillon Sharlet
· 3 days ago
470fb9c
Fix unused variable warning in `pack-lh-config.c`.
by Quentin Khan
· 4 days ago
ff3f40a
Move f16 sum/sum_squared kernels to avx512bw
by Dillon Sharlet
· 4 days ago
c187017
Speed up reduce kernels test
by Dillon Sharlet
· 4 days ago
81bb443
Minor readability and refactors for reductions
by Dillon Sharlet
· 4 days ago
e7efd4a
Fix out of bounds memory access when transposed A kernels have tile_m != 1
by Dillon Sharlet
· 4 days ago
84cdd30
Fix avx512fp16 sum to only add each input once.
by Dillon Sharlet
· 4 days ago
3e588fd
Added bf16 sum and sum_squared to sse2.
by Misha Gutman
· 4 days ago
8ff7ddb
Added bf16 sum and sum_squared to avx512.
by Misha Gutman
· 4 days ago
4d0cba7
Added bf16 sum and sum_squared to avx2.
by Misha Gutman
· 4 days ago
4f4cfcc
Move bias add to after the dot operation.
by XNNPACK Team
· 4 days ago
940155d
Update xnnpack shim
by XNNPACK Team
· 4 days ago
85468e7
Partially enable convolution and fully_connected subgraph tests when YNNPACK is enabled
by Dillon Sharlet
· 4 days ago
09e2d3e
Fix hole in reduce kernel test coverage
by Dillon Sharlet
· 5 days ago
59eb057
Fix some regressions in fully connected test
by Dillon Sharlet
· 5 days ago
a342cf6
Minor reduction cleanups
by Dillon Sharlet
· 6 days ago
995241f
Added bf16 sum and sum_squared to arm neon.
by Misha Gutman
· 7 days ago
9bab77f
Merge pull request #9244 from JonathanC-ARM:jclohess_sme2_pf32_igemm
by XNNPACK Team
· 7 days ago
0399c7d
Added sum_squared reductions to arm.
by Misha Gutman
· 7 days ago
1aef20a
Add neon-bf16 dot kernels
by Dillon Sharlet
· 7 days ago
3e084f0
Refactor ifdef to be inline with sme1 variant
by Jonathan Clohessy
· 7 days ago
fd03d06
Merge branch 'google:master' into jclohess_sme2_pf32_igemm
by Jonathan Clohessy
· 7 days ago
3c43700
Clean up SIMD headers and improve test coverage
by Dillon Sharlet
· 7 days ago
92813a3
Move code into ifdef guards and add sme2 packing variant for lhs
by Jonathan Clohessy
· 7 days ago
149835f
Added sum_squared reductions for more AVX variants.
by Misha Gutman
· 8 days ago
4a84acb
Added sum_squared for AVX512BW. Used multi_vec instead of s32x16x4.
by Misha Gutman
· 8 days ago
2a8e706
Added sum_squared for avx512bf16.
by Misha Gutman
· 8 days ago
d5affbf
Change ARM i8mm to be a transpose_a kernel
by Dillon Sharlet
· 8 days ago
6f79f4d
Don't try to optimize if static reshape's new shape isn't fully defined.
by Quentin Khan
· 8 days ago
549cca9
Fixed 2-bit gemm benchmark. Number of elements in packed weights didn't account for row_sum.
by Misha Gutman
· 8 days ago
7f398de
Added Map to ynnpack reductions. Added sum_squared for avx2.
by Misha Gutman
· 8 days ago
7884caf
Disable inlining of functions that attempt to disable sanitizers
by Dillon Sharlet
· 9 days ago
c18f402
Fix build when `XNN_ENABLE_KLEIDIAI` is false
by Dillon Sharlet
· 9 days ago
6d6315c
Test ynnpack on github
by Dillon Sharlet
· 9 days ago
c2b2b6f
Remove --enable-debug-tcg from qemu
by Dillon Sharlet
· 9 days ago
33cd488
Reduce the size of the dots in `consistent_arithmetic_test`
by Dillon Sharlet
· 9 days ago
b6c6ec5
Release temporary buffer in `xnn_create_fully_connected_nc_qd8_f32_qb4w_f16_scales`.
by Quentin Khan
· 9 days ago
fc3fc4e
Merge pull request #9208 from qualcomm:sme1/f16-gemm-igemm
by XNNPACK Team
· 9 days ago
1dd42d7
Deduplicate pack_b and transpose_a test helpers, and use the dot packer
by Dillon Sharlet
· 9 days ago
51eadbe
Optimize reduce kernel test
by Dillon Sharlet
· 9 days ago
57c3b8e
Improve reference implementation of quantization
by Dillon Sharlet
· 10 days ago
7c72232
Set -ffp-contract=off
by Dillon Sharlet
· 10 days ago
8798142
Fix warnings on some compilers
by Dillon Sharlet
· 10 days ago
e8dafff
Fix slicing of fused dimensions to adjust the min
by Dillon Sharlet
· 10 days ago
27dc881
Implement pf32 sme2 igemm
by Jonathan Clohessy
· 10 days ago
9d32140
Move fuse and slice helper function to slinky.h
by XNNPACK Team
· 10 days ago
6e93209
Merge pull request #8811 from qualcomm:pf32-igemm-sme1
by XNNPACK Team
· 10 days ago
db9a35e
Don't recompute the fingerprint id from the operator type when creating a fully connected operation.
by Quentin Khan
· 10 days ago
7be320a
Cleanup fingerprinting for deconvolution-nhwc.
by Quentin Khan
· 10 days ago
4bff83d
Correctly cleanup data when creating fully connected fingerprint.
by Quentin Khan
· 10 days ago
e3c63b0
Fingerprint deconvolution-nhwc kernels packing operations.
by Quentin Khan
· 10 days ago
872fcb0
Use average pooling instead of convolution as a "dummy stencil"
by Dillon Sharlet
· 10 days ago
56573c2
Apply inline lhs pack only for convolution2d node
by Vaisakh K V
· 10 days ago
3106129
Use average pooling instead of convolution as a "dummy stencil"
by Dillon Sharlet
· 10 days ago
eaf17a4
Added pf32 igemm kernel to fingerprint method
by Vaisakh K V
· 10 days ago
cfb5bc1
Create single function for dim peeling and fusion
by XNNPACK Team
· 10 days ago
3fa93a2
Merge remote-tracking branch 'google/master' into pf32-igemm-sme1
by Vaisakh K V
· 10 days ago
4b3f79d
build: allow generated files to be read by GN and Bazel
by Richard Townsend
· 11 days ago
e359448
Minor code refactor of elementwise.cc
by XNNPACK Team
· 11 days ago
d52fe00
Re-enable tile_k = 1 x86 kernel for msan
by Dillon Sharlet
· 11 days ago
ba0a32d
When fingerprinting avoid generating values that will lead to undefined behaviour.
by Quentin Khan
· 11 days ago
92f1973
Refactor fully connected create functions.
by Quentin Khan
· 11 days ago
7f2dbd0
Optimize buffer handling for ternary ops
by XNNPACK Team
· 11 days ago
e60d24e
Change binary op to use raw_buffer
by XNNPACK Team
· 11 days ago
f13e1b9
Optimize buffer handling for unary ops
by XNNPACK Team
· 11 days ago
457827b
Skip unnecessary add for avx512bf16 reductions
by Dillon Sharlet
· 14 days ago
4d17030
Updated copy right year
by Vaisakh K V
· 2 weeks ago
b11417e
Added SME1 benchmark for xnn_pf16_gemm_minmax_ukernel_32x32c2__neonsme
by Vaisakh K V
· 2 weeks ago
c17fd58
SME1 support for fp16 GEMM and IGEMM
by Vaisakh K V
· 2 weeks ago
e890d51
build: add a basic DEPS file for GN
by Richard Townsend
· 2 weeks ago
2ce9fa9
Merge pull request #9178 from qualcomm:sme1/f32_int8_int8
by XNNPACK Team
· 2 weeks ago
1b918df
Update Slinky version.
by Quentin Khan
· 2 weeks ago
ea72bb3
Added license header
by Vaisakh K V
· 2 weeks ago
62bfb5a
Don't include a file from `src` in `xnnpack.h`.
by Quentin Khan
· 2 weeks ago
a559de9
Updated Kleidiai
by Vaisakh K V
· 2 weeks ago
79192a4
Pass down the fingerprint key in convolution kernels.
by Quentin Khan
· 2 weeks ago
4fd530d
Updated Kleidiai version to pull the fixed matmul_clamp_f32_qai8dxp_qsi8cxp SME1 variant
by Vaisakh K V
· 2 weeks ago
61fc839
Fixed syntax error in yaml and regenerated the test case
by Vaisakh K V
· 2 weeks ago
7b0b9f7
Changing to uint2 computations in kernels.
by Misha Gutman
· 2 weeks ago
3c0e6a3
Fingerprint convolution-nhwc kernels packing operations.
by Quentin Khan
· 2 weeks ago
fd14fbd
Fingerprint convolution-nchw kernels packing operations.
by Quentin Khan
· 2 weeks ago
b3fea56
Don't assume that shapes are correct when doing the static shape propagation.
by Quentin Khan
· 2 weeks ago
454b1b5
Add `dot_flag::consistent_arithmetic` to the basic kernels when there are no other kernels
by Dillon Sharlet
· 3 weeks ago
d88e825
Remove `build_config.h` header
by Dillon Sharlet
· 3 weeks ago
b19450a
Add check for contiguity of output buffers
by XNNPACK Team
· 3 weeks ago
992770f
Fix race condition in fingerprinting
by Dillon Sharlet
· 3 weeks ago
d3efd0a
Add rounding for conversion of fp32 to bf16
by Dillon Sharlet
· 3 weeks ago