๐ŸŒ AIๆœ็ดข & ไปฃ็† ไธป้กต
  1. 79a6eac _mm256_packs_epi* is AVX2, not AVX by Dillon Sharlet · 9 hours ago upstream/master
  2. 6400256 Fix windows build flags by Dillon Sharlet · 16 hours ago
  3. 1e2d4d0 Added convert<bf16x8, f32x8> to x86_avx2 to decrease intrinsics usage in reduce kernels. by Misha Gutman · 18 hours ago
  4. 31b1ede simd test improvements by Dillon Sharlet · 20 hours ago
  5. 3c9ad05 Fix uses of AVX2 instructions in AVX targeted code by Dillon Sharlet · 20 hours ago
  6. aabc21c Automated Code Change by XNNPACK Team · 33 hours ago
  7. 269fbf1 Use a more intuitive formulation of `make_stencil_dim` by Dillon Sharlet · 2 days ago
  8. be53d66 Fixed int2 fully connected setting producers in the beginning of row_sum rewrite. by Misha Gutman · 2 days ago
  9. b666baa Templated neondot int2 GEMMs. by Misha Gutman · 2 days ago
  10. dfaf2ae Added scalar kernels for int2 gemm. by Misha Gutman · 2 days ago
  11. aa67973 Re-enable kernels disabled under msan due to msan bugs by Dillon Sharlet · 2 days ago
  12. dc05a09 Add @bazel_tools//tools/cpp/compiler:clang-cl as a signal to target MSVC-style builds by Dillon Sharlet · 2 days ago
  13. 631f179 Use _mm_packs_epi32 instead of _mm_packus_epi32 for converting to uint8 by Dillon Sharlet · 3 days ago
  14. 0852723 Give ternary nodes a proper type by Dillon Sharlet · 3 days ago
  15. 470fb9c Fix unused variable warning in `pack-lh-config.c`. by Quentin Khan · 4 days ago
  16. ff3f40a Move f16 sum/sum_squared kernels to avx512bw by Dillon Sharlet · 4 days ago
  17. c187017 Speed up reduce kernels test by Dillon Sharlet · 4 days ago
  18. 81bb443 Minor readability and refactors for reductions by Dillon Sharlet · 4 days ago
  19. e7efd4a Fix out of bounds memory access when transposed A kernels have tile_m != 1 by Dillon Sharlet · 4 days ago
  20. 84cdd30 Fix avx512fp16 sum to only add each input once. by Dillon Sharlet · 4 days ago
  21. 3e588fd Added bf16 sum and sum_squared to sse2. by Misha Gutman · 4 days ago
  22. 8ff7ddb Added bf16 sum and sum_squared to avx512. by Misha Gutman · 4 days ago
  23. 4d0cba7 Added bf16 sum and sum_squared to avx2. by Misha Gutman · 4 days ago
  24. 4f4cfcc Move bias add to after the dot operation. by XNNPACK Team · 4 days ago
  25. 940155d Update xnnpack shim by XNNPACK Team · 4 days ago
  26. 85468e7 Partially enable convolution and fully_connected subgraph tests when YNNPACK is enabled by Dillon Sharlet · 4 days ago
  27. 09e2d3e Fix hole in reduce kernel test coverage by Dillon Sharlet · 5 days ago
  28. 59eb057 Fix some regressions in fully connected test by Dillon Sharlet · 5 days ago
  29. a342cf6 Minor reduction cleanups by Dillon Sharlet · 6 days ago
  30. 995241f Added bf16 sum and sum_squared to arm neon. by Misha Gutman · 7 days ago
  31. 9bab77f Merge pull request #9244 from JonathanC-ARM:jclohess_sme2_pf32_igemm by XNNPACK Team · 7 days ago
  32. 0399c7d Added sum_squared reductions to arm. by Misha Gutman · 7 days ago
  33. 1aef20a Add neon-bf16 dot kernels by Dillon Sharlet · 7 days ago
  34. 3e084f0 Refactor ifdef to be inline with sme1 variant by Jonathan Clohessy · 7 days ago
  35. fd03d06 Merge branch 'google:master' into jclohess_sme2_pf32_igemm by Jonathan Clohessy · 7 days ago
  36. 3c43700 Clean up SIMD headers and improve test coverage by Dillon Sharlet · 7 days ago
  37. 92813a3 Move code into ifdef guards and add sme2 packing variant for lhs by Jonathan Clohessy · 7 days ago
  38. 149835f Added sum_squared reductions for more AVX variants. by Misha Gutman · 8 days ago
  39. 4a84acb Added sum_squared for AVX512BW. Used multi_vec instead of s32x16x4. by Misha Gutman · 8 days ago
  40. 2a8e706 Added sum_squared for avx512bf16. by Misha Gutman · 8 days ago
  41. d5affbf Change ARM i8mm to be a transpose_a kernel by Dillon Sharlet · 8 days ago
  42. 6f79f4d Don't try to optimize if static reshape's new shape isn't fully defined. by Quentin Khan · 8 days ago
  43. 549cca9 Fixed 2-bit gemm benchmark. Number of elements in packed weights didn't account for row_sum. by Misha Gutman · 8 days ago
  44. 7f398de Added Map to ynnpack reductions. Added sum_squared for avx2. by Misha Gutman · 8 days ago
  45. 7884caf Disable inlining of functions that attempt to disable sanitizers by Dillon Sharlet · 9 days ago
  46. c18f402 Fix build when `XNN_ENABLE_KLEIDIAI` is false by Dillon Sharlet · 9 days ago
  47. 6d6315c Test ynnpack on github by Dillon Sharlet · 9 days ago
  48. c2b2b6f Remove --enable-debug-tcg from qemu by Dillon Sharlet · 9 days ago
  49. 33cd488 Reduce the size of the dots in `consistent_arithmetic_test` by Dillon Sharlet · 9 days ago
  50. b6c6ec5 Release temporary buffer in `xnn_create_fully_connected_nc_qd8_f32_qb4w_f16_scales`. by Quentin Khan · 9 days ago
  51. fc3fc4e Merge pull request #9208 from qualcomm:sme1/f16-gemm-igemm by XNNPACK Team · 9 days ago
  52. 1dd42d7 Deduplicate pack_b and transpose_a test helpers, and use the dot packer by Dillon Sharlet · 9 days ago
  53. 51eadbe Optimize reduce kernel test by Dillon Sharlet · 9 days ago
  54. 57c3b8e Improve reference implementation of quantization by Dillon Sharlet · 10 days ago
  55. 7c72232 Set -ffp-contract=off by Dillon Sharlet · 10 days ago
  56. 8798142 Fix warnings on some compilers by Dillon Sharlet · 10 days ago
  57. e8dafff Fix slicing of fused dimensions to adjust the min by Dillon Sharlet · 10 days ago
  58. 27dc881 Implement pf32 sme2 igemm by Jonathan Clohessy · 10 days ago
  59. 9d32140 Move fuse and slice helper function to slinky.h by XNNPACK Team · 10 days ago
  60. 6e93209 Merge pull request #8811 from qualcomm:pf32-igemm-sme1 by XNNPACK Team · 10 days ago
  61. db9a35e Don't recompute the fingerprint id from the operator type when creating a fully connected operation. by Quentin Khan · 10 days ago
  62. 7be320a Cleanup fingerprinting for deconvolution-nhwc. by Quentin Khan · 10 days ago
  63. 4bff83d Correctly cleanup data when creating fully connected fingerprint. by Quentin Khan · 10 days ago
  64. e3c63b0 Fingerprint deconvolution-nhwc kernels packing operations. by Quentin Khan · 10 days ago
  65. 872fcb0 Use average pooling instead of convolution as a "dummy stencil" by Dillon Sharlet · 10 days ago
  66. 56573c2 Apply inline lhs pack only for convolution2d node by Vaisakh K V · 10 days ago
  67. 3106129 Use average pooling instead of convolution as a "dummy stencil" by Dillon Sharlet · 10 days ago
  68. eaf17a4 Added pf32 igemm kernel to fingerprint method by Vaisakh K V · 10 days ago
  69. cfb5bc1 Create single function for dim peeling and fusion by XNNPACK Team · 10 days ago
  70. 3fa93a2 Merge remote-tracking branch 'google/master' into pf32-igemm-sme1 by Vaisakh K V · 10 days ago
  71. 4b3f79d build: allow generated files to be read by GN and Bazel by Richard Townsend · 11 days ago
  72. e359448 Minor code refactor of elementwise.cc by XNNPACK Team · 11 days ago
  73. d52fe00 Re-enable tile_k = 1 x86 kernel for msan by Dillon Sharlet · 11 days ago
  74. ba0a32d When fingerprinting avoid generating values that will lead to undefined behaviour. by Quentin Khan · 11 days ago
  75. 92f1973 Refactor fully connected create functions. by Quentin Khan · 11 days ago
  76. 7f2dbd0 Optimize buffer handling for ternary ops by XNNPACK Team · 11 days ago
  77. e60d24e Change binary op to use raw_buffer by XNNPACK Team · 11 days ago
  78. f13e1b9 Optimize buffer handling for unary ops by XNNPACK Team · 11 days ago
  79. 457827b Skip unnecessary add for avx512bf16 reductions by Dillon Sharlet · 14 days ago
  80. 4d17030 Updated copy right year by Vaisakh K V · 2 weeks ago
  81. b11417e Added SME1 benchmark for xnn_pf16_gemm_minmax_ukernel_32x32c2__neonsme by Vaisakh K V · 2 weeks ago
  82. c17fd58 SME1 support for fp16 GEMM and IGEMM by Vaisakh K V · 2 weeks ago
  83. e890d51 build: add a basic DEPS file for GN by Richard Townsend · 2 weeks ago
  84. 2ce9fa9 Merge pull request #9178 from qualcomm:sme1/f32_int8_int8 by XNNPACK Team · 2 weeks ago
  85. 1b918df Update Slinky version. by Quentin Khan · 2 weeks ago
  86. ea72bb3 Added license header by Vaisakh K V · 2 weeks ago
  87. 62bfb5a Don't include a file from `src` in `xnnpack.h`. by Quentin Khan · 2 weeks ago
  88. a559de9 Updated Kleidiai by Vaisakh K V · 2 weeks ago
  89. 79192a4 Pass down the fingerprint key in convolution kernels. by Quentin Khan · 2 weeks ago
  90. 4fd530d Updated Kleidiai version to pull the fixed matmul_clamp_f32_qai8dxp_qsi8cxp SME1 variant by Vaisakh K V · 2 weeks ago
  91. 61fc839 Fixed syntax error in yaml and regenerated the test case by Vaisakh K V · 2 weeks ago
  92. 7b0b9f7 Changing to uint2 computations in kernels. by Misha Gutman · 2 weeks ago
  93. 3c0e6a3 Fingerprint convolution-nhwc kernels packing operations. by Quentin Khan · 2 weeks ago
  94. fd14fbd Fingerprint convolution-nchw kernels packing operations. by Quentin Khan · 2 weeks ago
  95. b3fea56 Don't assume that shapes are correct when doing the static shape propagation. by Quentin Khan · 2 weeks ago
  96. 454b1b5 Add `dot_flag::consistent_arithmetic` to the basic kernels when there are no other kernels by Dillon Sharlet · 3 weeks ago
  97. d88e825 Remove `build_config.h` header by Dillon Sharlet · 3 weeks ago
  98. b19450a Add check for contiguity of output buffers by XNNPACK Team · 3 weeks ago
  99. 992770f Fix race condition in fingerprinting by Dillon Sharlet · 3 weeks ago
  100. d3efd0a Add rounding for conversion of fp32 to bf16 by Dillon Sharlet · 3 weeks ago