233 Blog
  • 2025-07-25T06:08:31Z
    note
    docker run --gpus all --name triton_gpu_volumes --user $(id -u):$(id -g) -v /home/ruibohuang/docker:/workspace -it ruibo_triton_gpu_1 /bin/bash
  • 2025-07-25T06:07:50Z
    note
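    3.28 Onboarding and workstation setup; short meeting to settle requirements.
    3.31 Business trip to Shenzhen; finished building spc; further firmed up the Triton-like approach planned for April.
    4.1 Read up on cute layout_algebra; set up machine 10.1.100.63 with a debug build of triton. Tips: used image 61da7c726a63; links may need to be changed inside docker; libzstd-dev is also needed inside docker; the container has to be started with run --gpus all.
    4.2 Further tidied the Matmul operator interface, settled the launchKernel parameters, and wrote "Triton-like approach analysis".
    4.3 First draft of the Triton-like approach; explored PE id mapping.
    4.7 Finished the PE id mapping computation and refined the Triton-like approach; spent the afternoon on special handling of bad PEs.
    4.8 Planned to understand the RotaryEmbNeoxNoOffsets operator; got the general logic.
    4.9 Revised the approach to account for bad PEs.
    4.10 Considered the broadcast problem with bad PEs and discussed it.
    4.11 Business trip; revised the firmware-related scheme.
    4.14 Refined the Triton-like approach.
    4.15 Tried writing Rotary Emb as a triton operator.
    4.16 Tried writing RotaryEmb in SPAC.
    4.17 Investigated third-party backend integration for triton; interview. "triton-sparse test status via iree-run-module".
    4.18 Tried the triton integration; got an empty backend added before the end of the day, lowering to tt IR. "Adding a new backend to triton".
    4.21 Tried adding a new Pass pipeline.
    4.22 New Pass pipeline works. "Adding a new Pass pipeline to triton".
    4.23 04-matrix-multiplication-use-heuristics.py: used heuristics to autotune our grid; added the TritonPtrToMemref Pass.
    4.24 Studied tensor type conversion in triton IR; found that I can borrow the conversion from triton-shared; looking for the smallest unit of change.
    4.25 triton-sparse can now produce linalg IR; checked its forward and backward correspondence.
    4.27 Introduced the SPAsync dialect; resolved the API changes caused by the LLVM upgrade and the differences between how IREE and triton add a new dialect.
    4.28 Changed the Pass flow to TritonToBufferization, BufferizationToSPAsync to lower into the SPAsync dialect; it now runs to completion.
    4.29 Completed related docs, such as "triton-sparse project overview", "How to add a Pass to triton-sparse", and "Adding a new backend to triton".
    4.30 Studied the existing cmodel flow; it seems that getting it to run is enough.
    5.6 Extended README.md; went to run cmodel.
    5.7 Installed virtualization software. Got v2_soc_simulator running.
    5.8 Tried the runtime build options.
    5.9 Weekly meeting; tried adding a sigmoid Op.
    5.12 sigmoid Op added. "Adding a new Op to triton".
    5.13 Tested the new fake_host. "Testing operators with fakehost".
    5.14 Hooked fake_host and spc-compile into triton-sparse.
    5.15 fake_host runs the triton-sparse flow end to end.
    5.16 Tested silu; there is a version problem.
    5.19 silu works; kernels with loops work after the update.
    5.20 triton-sparse launch scheme.
    5.21 Passing n_elements is broken; plan to set up a regression test mechanism with the SoC simulator.
    5.22 Found that spc and the simulator share an interface; stumbled on https://github.com/FlagTree/flagtree-preview and https://github.com/FlagTreeZhouyi/flagtree-zhouyi, both good finds.
    5.23 Added cases to the v2 simulator.
    5.26 Merged triton-sparse; tracked down and fixed bugs; added cases.
    5.27 Tested rope; results are wrong.
    5.28 Studied triton in pytorch.
    5.29 Read the vllm source. Provided an example of passing i32 arguments.
    5.30 Took over the reinterpret cast conversion.
    6.3 Fixed the reinterpret cast conversion; upgraded the operators for the v2 simulator; wrestled with the server environment.
    6.4 rope matches the reference results; server 10.1.20.143 runs ubuntu18.04, so the SoC simulator cannot run there.
    6.5 Wrestled with the environment on 10.1.110.110.
    6.6 Installed the driver; runtime tests.
    6.9 "Triton launch analysis: dynamic shape".
    6.10 Wrestled with the launch environment that depends on torch + SOLA + simulator; no success.
    6.11 "triton-sparse launch scheme (SOLA)", in design.
    6.12 Lowered tt.make_tensor_ptr to memref.reinterpret_cast + memref.subview; the fully dynamic case works.
    6.13 Combining it with dynamic values failed, with the error at mlir/lib/Interfaces/ViewLikeInterface.cpp#L34: expected 0 dynamic offset values.
    6.16 Studied the memref dialect.
    6.17 Lowering of make_tensor_ptr for rope: implemented two variants; dongben's variant, which applies the offset on the subview, matched the reference results in iree-runtime.
    6.18 Tried setting up the full-flow environment on 10.1.9.110; spc must be built with clang, not gcc (gcc gives a pile of errors); because of the upgrade, fake_host can no longer be used as a dependency.
    6.19 Tried hooking iree-run-module into triton-sparse; next step may be an mlir-runner for hlo.
    6.20 Arguments for iree-run-module are ready. "Use iree run module replace fake host".
    6.23 The spc upgrade broke iree-run-module; fixed. Confirmed with houjue that dynamic shapes need their sizes supplied. "triton-sparse test status via iree-run-module".
    6.24 Kicked off hlo-runner; the plan is to hook up what exists now first. Built sparse-runner.
    6.25 Handled third-party libraries; modified the cmake files.
    6.26 Interface definition and joint debugging between triton-sparse and sparse-runner.
    6.27 Changed the quark_sim dependency and modified the source. "triton-sparse test status on sparse-runner".
    6.30 Merged everything into the development branch. Looked into setting up a CI framework; installed and configured gitlab-runner.
    7.1 Wrote CI examples and refined the CI test flow.
    7.2 Merge branch 'add-gitlab-ci' into 'develop'; CI passes; started simple development of hlo-runner.
    7.3 Removed triton's dependency on network access.
    7.4 Built multi-level runners.
    7.7 Fully offline NEXUS build and test.
    7.8 Made CI standalone.
    7.9 Standalone CI kernel done; NEXUS merged develop and was tested.
    7.10 Pushed notifications to WeCom (企业微信); merged rmsnorm.
    7.11 Wrestled with the torch environment.
    7.14 Used a separately defined operator library; wrestled with headers and the build.
    7.15 Integrated spc and sparse-runner into triton-sparse; project dependencies.
    7.16 Build and packaging.
    7.17 Resolved compilation of the Python code.
    7.18 sparse-runner reports a -opaque-pointers error.
    7.21 Built flatbuffers with the clang that triton-sparse depends on.
    7.22 Added sphal.dump back into sparse-runner.
    7.23 Vector-add operator definition and unit test script.
    7.24 Vector-add operator demo finished.
    7.25 Matrix-multiply operator demo.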
  • 2022-10-31T06:56:53Z
    note
    void test_if_and(long* dst) {
        int a = 1;
        if (a == 1 and a % 3 == 5) {
            a = 0x76a1;
        }
    }

    void test_double_bf(long* dst) {
        // isa_replace
        int i0 = 0x5f, i1 = 0x5f;
        int i2 = 1, i3 = 1, i4 = 0x7f;
        int a[10];
        asm("##start");
        asm volatile(
            "i.sfeqi %3 1\n"
            "c.bf .HRB_Label\n"
            " .p2align 3\n"
            "i.addi %1 zr 0x4f\n"
            ".HRB_Label:\n"
            "i.addi %0 zr 0x3f\n"
            "c.bf .HRB_Label1\n"
            " .p2align 3\n"
            "i.addi %1 %1 0x5f\n"
            ".HRB_Label1:\n"
            "i.addi %0 %0 0x6f\n"
            : "=r"(i0), "=r"(i1)
            : "r"(i2), "r"(i3), "r"(i4)
        );
        asm("##end");
        dst[0] = i0;
        dst[1] = i1;
        dst[2] = i2;
        dst[3] = i3;
        dst[4] = i4;
    }

    int test_ssjs() {
        // int i1=0x76a1,i2=0x1;
        return 0x76a1 * 0x1;
    }

    int test_func1(long* dst) {
        int i0 = 0x3f, i1 = 0x3f;
        asm("##test_func1 start");
        asm volatile(
            "i.mov %0 lr\n"
            : "=r"(i0)
        );
        long c = 0x1122334455667788LL, d = 0x11223344556677LL, e;
        e = c + d;
        c = 0x1122334455667788LL, d = 0x11223344556677LL;
        e = c + d;
        c = 0x1122334455667788LL, d = 0x11223344556677LL;
        e = c + d;
        c = 0x1122334455667788LL, d = 0x11223344556677LL;
        e = c + d;
        c = 0x1122334455667788LL, d = 0x11223344556677LL;
        e = c + d;
        asm("##test_func1 end");
        // dst[1]=0+test_ssjs();
        return i0 + i1;
    }

    int test_func2(long* dst) {
        int i0 = 0x3f, i1 = 0x3f;
        asm("##test_func2 start");
        asm volatile(
            "i.mov %0 lr\n"
            : "=r"(i0)
        );
        asm("##test_func2 end");
        dst[2] = i0;
        return i0 + i1;
    }

    long test_func3(long* dst) {
        int i0 = 0x3f, i1 = 0x3f;
        asm("##test_func3 start");
        asm volatile(
            "i.mov %0 lr\n"
            : "=r"(i0)
        );
        asm("##test_func3 end");
        dst[3] = i0;
        return dst[3];
    }

    float test_func4(long* dst) {
        int i0 = 0x3f, i1 = 0x3f;
        asm("##test_func4 start");
        asm volatile(
            "i.mov %0 lr\n"
            : "=r"(i0)
        );
        asm("##test_func4 end");
        dst[4] = i0;
        return dst[4];
    }

    double test_func5(long* dst) {
        int i0 = 0x3f, i1 = 0x3f;
        asm("##test_func5 start");
        asm volatile(
            "i.mov %0 lr\n"
            : "=r"(i0)
        );
        asm("##test_func5 end");
        dst[5] = i0;
        return dst[5];
    }

    // What extra instructions float adds (already commented out).
    // Note: test_func6 through test_func8 reuse the "##test_func5" markers.
    float test_func6(long* dst) {
        int i0 = 0x3f, i1 = 0x3f;
        asm("##test_func5 start");
        asm volatile(
            "i.mov %0 lr\n"
            : "=r"(i0)
        );
        asm("##test_func5 end");
        dst[6] = i0;
        return dst[6];
    }

    double test_func7(long* dst) {
        int i0 = 0x3f, i1 = 0x3f;
        asm("##test_func5 start");
        asm volatile(
            "i.mov %0 lr\n"
            : "=r"(i0)
        );
        asm("##test_func5 end");
        dst[7] = i0;
        return dst[7];
    }

    float test_func8(long* dst) {
        int i0 = 0x3f;
        asm("##test_func5 start");
        asm volatile(
            "i.mov %0 lr\n"
            : "=r"(i0)
        );
        asm("##test_func5 end");
        dst[8] = i0;
        return dst[8];
    }
  • 2022-10-30T13:16:30Z
    note
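    if not is_confirm_in_time():
        # Dialog text: "Error", "Not activated for more than 7 days; please contact 黄睿博, phone 178-5826-3110"
        messagebox.showerror("错误", "超过7天未激活,请联系黄睿博,手机号 178-5826-3110")
        root.quit()
    else:
        root.mainloop()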
  • 2022-10-30T13:13:12Z
    note
    git config credential.helper store
  • 2022-10-30T13:00:47Z
    note
    import os
    import json
    from enum import Enum
    from datetime import datetime, timedelta

    from api import open_success

    current_mode = '2'
    current_setting_dic = {}
    button_list = []
    TimeButtonList = []
    kind_time_list = []
    time_str_list = ['男生时间', '女生时间', '休息时间']  # boys' time, girls' time, rest time
    times = 3


    class RunMode:
        mode_type = 2
        male_time = 90
        female_time = 100
        rest_time = 150

        def __init__(self, **kwargs):
            for k, v in kwargs.items():
                setattr(self, k, v)


    def default_config():
        global current_setting_dic
        current_setting_dic = {
            'current_mode': str(2),
            'confirm': '0',
            'first_open_time': datetime.now().strftime("%Y-%m-%d_%H_%M"),
            '0': class2dict(RunMode(mode_type=0, female_time=0)),
            '1': class2dict(RunMode(mode_type=1, male_time=0)),
            '2': class2dict(RunMode()),
        }
        save_config()


    def save_config():
        resource_dir = os.path.join('resources')
        if not os.path.exists(resource_dir):
            os.makedirs(resource_dir)
        with open(os.path.join('resources', 'config.json'), 'w') as fp:
            fp.write(json.dumps(current_setting_dic))


    def load_config_by_file():
        global current_setting_dic, current_mode
        with open(os.path.join('resources', 'config.json'), 'r') as fp:
            setting_json = json.loads(fp.read())
        if setting_json.get('current_mode') is None:
            # Fall back to defaults and keep them, instead of overwriting
            # them with the incomplete file contents below.
            default_config()
            return
        current_setting_dic = setting_json
        current_mode = setting_json.get('current_mode')


    def class2dict(f):
        return dict((name, getattr(f, name)) for name in dir(f) if not name.startswith('__'))


    def get_current_setting_dic():
        return current_setting_dic


    def confirm_success():
        current_setting_dic['confirm'] = 'run_confirm'
        save_config()


    def confirm_fail():
        current_setting_dic['confirm'] = '0'
        save_config()


    def load_config():
        try:
            load_config_by_file()
        except (OSError, ValueError):
            # First run (missing or unreadable config): initialise the defaults.
            default_config()
        if open_success(0):
            confirm_success()
        else:
            confirm_fail()


    load_config()


    class ModType(Enum):
        male = 0
        female = 1
        male_and_female = 2


    def modify_mode_fuc(new_mode):
        global current_mode
        current_setting_dic['current_mode'] = str(new_mode)
        current_mode = current_setting_dic['current_mode']


    def get_mode():
        return current_mode


    def get_kind_time_fuc(loop_index):
        if loop_index == 0:
            return current_setting_dic[current_mode]['male_time']
        if loop_index == 1:
            return current_setting_dic[current_mode]['female_time']
        if loop_index == 2:
            return current_setting_dic[current_mode]['rest_time']


    def set_kind_time_fuc(loop_index, new_time):
        if loop_index == 0:
            current_setting_dic[current_mode]['male_time'] = new_time
        if loop_index == 1:
            current_setting_dic[current_mode]['female_time'] = new_time
        if loop_index == 2:
            current_setting_dic[current_mode]['rest_time'] = new_time


    def get_format_kind_time_fuc(loop_index):
        kind_time = get_kind_time_fuc(loop_index)
        # Zero-pad the seconds so that 65 is shown as 1:05 rather than 1:5.
        return f'{kind_time // 60}:{kind_time % 60:02d}'


    def get_time_list_fuc():
        time_list = []
        time_status_list = []
        mode_setting = current_setting_dic[current_mode]
        # Each round is: 10 s countdown, run time, rest time.
        if mode_setting['male_time']:
            time_list.extend([10, mode_setting['male_time'], mode_setting['rest_time']])
            time_status_list.extend(['倒计时', '男生', '休息'])  # countdown, boys, rest
        if mode_setting['female_time']:
            time_list.extend([10, mode_setting['female_time'], mode_setting['rest_time']])
            time_status_list.extend(['倒计时', '女生', '休息'])  # countdown, girls, rest
        return time_list, time_status_list


    def get_confirm_info():
        return current_setting_dic.get('confirm')


    def is_confirm():
        return get_confirm_info() == 'run_confirm'


    def is_confirm_in_time():
        first_open_time_str = current_setting_dic.get('first_open_time')
        if first_open_time_str is None:
            current_setting_dic['first_open_time'] = datetime.now().strftime("%Y-%m-%d_%H_%M")
            save_config()
            return True
        first_open_time = datetime.strptime(first_open_time_str, "%Y-%m-%d_%H_%M")
        return is_confirm() or datetime.now() < first_open_time + timedelta(days=7)


    def is_first_open():
        if not os.path.exists(os.path.join('resources', 'config.json')):
            return True
        load_config_by_file()
        first_open_time_str = current_setting_dic.get('first_open_time')
        if first_open_time_str is None:
            return True
        # The original never parsed the string and passed years=1, which
        # timedelta does not accept; 365 days is used here instead.
        first_open_time = datetime.strptime(first_open_time_str, "%Y-%m-%d_%H_%M")
        return datetime.now() < first_open_time + timedelta(days=365)
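The trial-window check in is_confirm_in_time depends on the wall clock and on the config file, which makes it awkward to try out. Below is a minimal sketch of the same comparison as a pure function; the name within_trial and its parameters are hypothetical and not part of the note's code, and only the time format string and the 7-day window come from the note.

```python
from datetime import datetime, timedelta

# Same timestamp format the note uses for 'first_open_time'.
TIME_FORMAT = "%Y-%m-%d_%H_%M"


def within_trial(first_open: str, now: datetime,
                 confirmed: bool = False, days: int = 7) -> bool:
    """Return True if the app is activated or still inside the trial window.

    Hypothetical helper: `now` is passed in instead of calling datetime.now(),
    so the boundary cases can be checked without waiting a week.
    """
    opened = datetime.strptime(first_open, TIME_FORMAT)
    return confirmed or now < opened + timedelta(days=days)
```

Passing the current time as an argument keeps the comparison identical to the one in the note while letting the expiry boundary be exercised directly; the window is exclusive, so a check made exactly 7 days after first open already counts as expired.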
  • 2022-10-30T13:00:31Z
    note
    import os
    import time
    from datetime import datetime

    try:
        from Tkinter import *
        from Tkinter import messagebox, simpledialog
    except ImportError:
        from tkinter import *
        from tkinter import messagebox, simpledialog

    from timer import Timer
    import draw_widget
    from draw_widget import draw_mode_show_frame, draw_modify_current_mode, draw_time_setting, draw_control_frame,\
        add_time_list, draw_times_frame
    from play_music import *
    # is_first_open is used in main() but was missing from the original import.
    from run_config import is_confirm_in_time, is_first_open

    root = Tk()


    def get_pwd(n):
        char_lst = "ZPAQMOVJGCWIFXTUSNLEHRYBDK"
        pwd_str = ''
        for _ in range(10):
            pwd_str = pwd_str + char_lst[n % 26]
            n = n // 29
        return pwd_str[1] + pwd_str[2] + pwd_str[4] + pwd_str[3] + pwd_str[0]


    def get_num():
        # The value changes once per minute.
        return (int(time.time()) // 60 * 1489 + 109) // 21 * 57


    def ask_code():
        dt = datetime.now().strftime('%Y/%m/%d %H:%M')
        # The original opened an empty Toplevel titled '验证信息' ("verification")
        # and returned nothing, so the comparison in main() could never succeed.
        # This uses the commented-out askstring call instead, without the
        # unsupported buttons argument.
        # Prompt: "It is now {dt}; please contact 黄睿博 and enter your authorization code:"
        return simpledialog.askstring(title='验证信息',
                                      prompt=f'现在是{dt},请联系黄睿博输入您的授权码:',
                                      parent=root)


    def main():
        if is_first_open():
            # messagebox.showerror("错误", "超过7天未激活,请联系黄睿博,手机号 178-5826-3110")
            # Ask once (the original called ask_code() twice) and stop on a wrong code.
            if ask_code() != get_pwd(get_num()):
                root.quit()
                return
        root.iconbitmap(os.path.join('resources', 'icon.ico'))
        root.resizable(False, False)
        root.title("台州学院附属中学跑步系统")  # window title: running system of the school
        root.configure()
        add_time_list()
        frame2 = Frame(root, relief=RAISED)
        sw = Timer(frame2)
        draw_widget.sw = sw
        draw_mode_show_frame(root)
        draw_modify_current_mode(root)
        draw_time_setting(root)
        frame2.pack()
        sw.pack(pady=5)
        draw_control_frame(root)
        root.mainloop()


    if __name__ == '__main__':
        main()
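Because get_num is derived from the current minute, the code produced by get_pwd(get_num()) changes every minute, so whoever issues the code and whoever types it in have to do so within the same minute. The sketch below restates the two formulas from the note with the timestamp passed in explicitly; the names minute_seed and code_for are hypothetical, chosen only for this illustration.

```python
ALPHABET = "ZPAQMOVJGCWIFXTUSNLEHRYBDK"  # same 26-letter table as in the note


def minute_seed(epoch_seconds: int) -> int:
    """Same arithmetic as get_num(), but for an explicit Unix timestamp."""
    return (epoch_seconds // 60 * 1489 + 109) // 21 * 57


def code_for(seed: int) -> str:
    """Same letter selection and shuffling as get_pwd()."""
    chars = []
    for _ in range(10):
        chars.append(ALPHABET[seed % 26])
        seed //= 29
    # Only five of the ten letters are used, in a fixed shuffled order.
    return chars[1] + chars[2] + chars[4] + chars[3] + chars[0]


if __name__ == "__main__":
    t = 1_700_000_040  # an arbitrary timestamp on a minute boundary
    print(code_for(minute_seed(t)))       # code at the start of the minute
    print(code_for(minute_seed(t + 59)))  # same code at the end of the minute
```

Taking the timestamp as a parameter makes it possible to generate the expected code for any given minute without touching the GUI.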
  • 2022-10-30T01:12:15Z
    note
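    Steps for changing the v2ray setup:
    1. Edit the caddy configuration under /etc/caddy.
    2. After editing, reload it: systemctl reload caddy
    3. Open the port: ufw allow port (to view the rules, use ufw status numbered or ufw status verbose)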
  • 2022-10-30T00:40:45Z
    note
    vmess://eyJhZGQiOiJhY2pveS54eXoiLCJhaWQiOiIwIiwiaG9zdCI6ImFjam95Lnh5eiIsImlkIjoiNjhmY2JlNjYtODQwZS00ODA1LTgzZWQtNmMxZDNiNWZiYWMzIiwibmV0Ijoid3MiLCJwYXRoIjoiL3BvdCIsInBvcnQiOiI2NDUzIiwicHMiOiIyMzN2Mi5jb21fYWNqb3kueHl6IiwidGxzIjoidGxzIiwidHlwZSI6Im5vbmUiLCJ2IjoiMiJ9
  • 2022-10-30T00:37:45Z
    note
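    import itertools as it

    # The format string is space-separated: the mnemonic first, then the operand fields.
    asm_real_format_lst = isa_info_query.asm_real_format.split(' ')
    product_lst = []
    for i in range(1, len(asm_real_format_lst)):
        if asm_real_format_lst[i] != 'imm':
            product_lst.append(list(range(32)))      # non-immediate operand: 32 possible values
        else:
            product_lst.append(list(range(2**9)))    # immediate operand: 9-bit values
    asm_lst = []
    # Cartesian product over all operand values gives every encodable instruction.
    for num_lst in it.product(*product_lst):
        asm_lst.append(f'{isa_info_query.mnemonic} {generate_one_isa(num_lst, asm_real_format_lst[1:])}')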
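The snippet above relies on isa_info_query and generate_one_isa, which are defined elsewhere. As a self-contained sketch of the same enumeration idea, the version below takes the mnemonic and format string as arguments; render_operands is a hypothetical stand-in for generate_one_isa, and the r<N> register spelling is an assumption made only for illustration.

```python
import itertools as it


def operand_ranges(fmt, n_regs=32, imm_bits=9):
    """Split a format string and return (fields, one value range per field)."""
    fields = fmt.split(' ')[1:]  # drop the mnemonic
    ranges = [range(2 ** imm_bits) if f == 'imm' else range(n_regs) for f in fields]
    return fields, ranges


def render_operands(values, fields):
    """Hypothetical stand-in for generate_one_isa: registers as r<N>, immediates as numbers."""
    return ' '.join(str(v) if f == 'imm' else f'r{v}' for v, f in zip(values, fields))


def enumerate_asm(mnemonic, fmt, n_regs=32, imm_bits=9):
    """Yield one assembly line per operand combination (Cartesian product)."""
    fields, ranges = operand_ranges(fmt, n_regs, imm_bits)
    for values in it.product(*ranges):
        yield f'{mnemonic} {render_operands(values, fields)}'


if __name__ == "__main__":
    # Tiny ranges so the whole output is easy to inspect.
    for line in enumerate_asm('i.addi', 'i.addi rd rs imm', n_regs=2, imm_bits=1):
        print(line)
```

With the sizes used in the note (32 values per non-immediate field, 9-bit immediates), a two-register-plus-immediate format yields 32 × 32 × 512 = 524,288 lines, which is why this sketch uses a generator rather than building the full list in memory.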