Posts

Showing posts from 2019

LLVM Machine Instruction: Convergent attribute

ref: http://lists.llvm.org/pipermail/llvm-dev/2015-August/089241.html

1. The convergent attribute is useful for the SIMT/SPMD programming model.
2. The intended interpretation is that a convergent operation cannot be moved either into or out of a conditionally executed region.
3. If you have a convergent instruction A, it is legal to duplicate it to instruction B if (assuming B is after A in program flow) A dominates B and B post-dominates A.

case:

    r1 = texture2D(..., r0, ...)
    if (...) {
      // r0 used as temporary here
      r0 = ...
      r2 = r0 + ...
    } else {
      // only use of r1
      r2 = r1 + ...
    }

In this example, various optimizations might try to sink the texture2D operation into the else block, like so:

    if (...) {
      r0 = ...
      r2 = r0 + ...
    } else {
      r1 = texture2D(..., r0, ...)
      r2 = r1 + ...
    }

In most SPMD/SIMT implementations, the fallout of this race is exposed via the predicated expression of acyclic control flow:

    pred0 <- cmp ...
    if (pred0) r0 = ...
    ...
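To make the hazard concrete, here is a minimal CPU-side sketch that simulates four SIMT lanes running the predicated code, before and after the sink. Everything in it is an assumption made for illustration: `convergent_fetch` is a hypothetical stand-in for `texture2D` that reads a neighbouring lane's coordinate (roughly how implicit-derivative texture fetches behave), `kLanes` is an arbitrary group size, and the value 100 is an arbitrary temporary written to `r0`. It is not LLVM code and does not model any particular GPU.

```cpp
#include <array>
#include <cassert>

constexpr int kLanes = 4;                 // assumed group size, for illustration only
using Reg  = std::array<float, kLanes>;   // a register holds one value per lane
using Mask = std::array<bool, kLanes>;    // true = lane executes the instruction

// Hypothetical stand-in for texture2D: every active lane also reads the
// coordinate held by its neighbour (lane ^ 1), whether or not that
// neighbour is active. This cross-lane read is what makes it convergent.
Reg convergent_fetch(const Reg& coord, const Mask& active) {
  Reg out{};
  for (int l = 0; l < kLanes; ++l)
    if (active[l]) out[l] = coord[l ^ 1] - coord[l];
  return out;
}

// Predicated form of the original program: the fetch runs on all lanes
// before the branch, so it sees every lane's original r0.
Reg run_original(Reg r0, const Mask& pred0) {
  Mask all;
  all.fill(true);
  Reg r1 = convergent_fetch(r0, all);
  Reg r2{};
  for (int l = 0; l < kLanes; ++l) if (pred0[l])  r0[l] = 100.0f;        // r0 = ...
  for (int l = 0; l < kLanes; ++l) if (pred0[l])  r2[l] = r0[l] + 1.0f;  // r2 = r0 + ...
  for (int l = 0; l < kLanes; ++l) if (!pred0[l]) r2[l] = r1[l] + 1.0f;  // r2 = r1 + ...
  return r2;
}

// Predicated form after sinking: the fetch runs only on the else lanes,
// after the if lanes have already overwritten their copy of r0.
Reg run_sunk(Reg r0, const Mask& pred0) {
  Mask not_pred0;
  for (int l = 0; l < kLanes; ++l) not_pred0[l] = !pred0[l];
  Reg r2{};
  for (int l = 0; l < kLanes; ++l) if (pred0[l])  r0[l] = 100.0f;        // r0 = ...
  for (int l = 0; l < kLanes; ++l) if (pred0[l])  r2[l] = r0[l] + 1.0f;  // r2 = r0 + ...
  Reg r1 = convergent_fetch(r0, not_pred0);
  for (int l = 0; l < kLanes; ++l) if (!pred0[l]) r2[l] = r1[l] + 1.0f;  // r2 = r1 + ...
  return r2;
}

// Usage: lane 0 takes the if side, lanes 1..3 take the else side.
void check_divergence() {
  const Reg r0{0.0f, 1.0f, 2.0f, 3.0f};
  const Mask pred0{true, false, false, false};
  const Reg a = run_original(r0, pred0);
  const Reg b = run_sunk(r0, pred0);
  assert(a[1] != b[1]);  // lane 1 reads lane 0's r0, which the if side clobbered
  assert(a[2] == b[2]);  // lane 2 reads lane 3, also an else lane: unaffected
}
```

In this sketch only lane 1 changes, because it is the only else lane whose neighbour took the if side. That matches the constraint stated in point 2: the operation's result depends on which lanes execute it together, so it cannot be moved across the condition.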

Stage Mix

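Stage mixes are almost always cut together from performances by those large Korean groups. To pull one off, you need:

1. Industrialized, consistent camera work and camera equipment
2. Military-standard choreography
3. Careful editing

Industrially consistent shot composition is a must, because in a large group every member has to be given a fair share of screen time. Military-standard choreography is also a must, because a single wrong step would spoil the carefully designed shots.

I was wondering... maybe it has to be done this way for people to want to go to the live show. Only at the venue can you keep your eyes locked on your idol without looking away for a moment, and see what you normally cannot see... By that logic, the released video of any single concert is actually rather boring, because the only things that change are the costumes and the set~(?)

see https://blog.edumeme.org/2017/03/blog-post.html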

ARMv7 NEON VQRDMULH instruction implementation

VQRDMULH: Vector Saturating Rounding Doubling Multiply Returning High Half.

VQRDMULH multiplies corresponding elements in two vectors, doubles the results, and places the most significant half of the final results in the destination vector.

Reference implementation: https://github.com/google/gemmlowp/blob/master/fixedpoint/fixedpoint.h#L329

<code>
// This function implements the same computation as the ARMv7 NEON VQRDMULH
// instruction.
template <>
inline std::int32_t SaturatingRoundingDoublingHighMul(std::int32_t a,
                                                      std::int32_t b) {
  bool overflow = a == b && a == std::numeric_limits<std::int32_t>::min();
  std::int64_t a_64(a);
  std::int64_t b_64(b);
  std::int64_t ab_64 = a_64 * b_64;
  std::int32_t nudge = ab_64 >= 0 ? (1 << 30) : (1 - (1 << 30));
  std::int32_t ab_x2_high32 =
      static_cast<std::int32_t>((ab_64 + nudge) / (1ll << 31));
  return overflow ? std::numeric_limits...
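As a worked example of "double the product and keep the high half", the sketch below computes one 32-bit lane and checks a few values. It is a standalone illustration, not the gemmlowp source: the function name `sat_rounding_doubling_high_mul` and the helper `check_q31_examples` are hypothetical, and reading the operands as Q31 fixed-point fractions (where `1 << 30` means 0.5) is an interpretation added here to make the numbers easier to follow.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: a scalar sketch of one 32-bit lane of the operation.
std::int32_t sat_rounding_doubling_high_mul(std::int32_t a, std::int32_t b) {
  // INT32_MIN * INT32_MIN is the only product whose doubled value (2^63)
  // has a high half (2^31) that does not fit in int32, so saturate.
  if (a == b && a == std::numeric_limits<std::int32_t>::min())
    return std::numeric_limits<std::int32_t>::max();
  const std::int64_t ab = static_cast<std::int64_t>(a) * b;
  // Rounding term: about half of 2^31, signed like the product, because
  // the integer division below truncates toward zero.
  const std::int64_t nudge = ab >= 0 ? (1LL << 30) : (1 - (1LL << 30));
  // Dividing by 2^31 is the same as doubling and keeping the high 32 bits.
  return static_cast<std::int32_t>((ab + nudge) / (1LL << 31));
}

// Usage: a few values, read as Q31 fractions where it helps.
void check_q31_examples() {
  const std::int32_t half = 1 << 30;  // 0.5 in Q31
  const std::int32_t lo = std::numeric_limits<std::int32_t>::min();
  const std::int32_t hi = std::numeric_limits<std::int32_t>::max();
  assert(sat_rounding_doubling_high_mul(half, half) == (1 << 29));    // 0.5 * 0.5 = 0.25
  assert(sat_rounding_doubling_high_mul(-half, half) == -(1 << 29));  // -0.5 * 0.5 = -0.25
  assert(sat_rounding_doubling_high_mul(3, half) == 2);               // 1.5 rounds to 2
  assert(sat_rounding_doubling_high_mul(lo, lo) == hi);               // saturated
}
```

Doing the multiply in 64 bits keeps the intermediate product exact, so the only place precision is lost is the final rounding step.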