        Computer Architecture
        2024 Spring
Final Project Part 2

Overview
        Tutorial
        ● Gem5 Introduction
        ● Environment Setup
        Projects
        ● Part 1 (5%)
○ Write a C++ program to analyze the specification of the L1 data cache.
        ● Part 2 (5%)
○ Given the hardware specifications, try to get the best performance for a more complicated program.
Project 2

Description

In this project, we will use a computer system with a two-level cache. Your task is to write a ViT (Vision Transformer) in C++ and optimize it. You can see more details of the system specification on the next page.
System Specifications
        ● ISA: X86
        ● CPU: TimingSimpleCPU (no pipeline, CPU stalls on every memory request)
        ● Caches
○ L1 I cache and L1 D cache connect to the same L2 cache
        ● Memory size: 8192MB
          I cache size  I cache associativity  D cache size  D cache associativity  Policy  Block size
L1 cache  16KB          8                      16KB          4                      LRU     **B
L2 cache  –             –                      1MB           16                     LRU     **B

ViT (Vision Transformer) – Transformer Overview
● A basic transformer block consists of
○ Layer Normalization
○ MultiHead Self-Attention (MHSA)
○ Feed Forward Network (FFN)
○ Residual connection (Add)
● You only need to focus on how to implement the function in the red box
● If you only want to complete the project instead of understanding the full ViT algorithm, you can skip the sections marked in red

ViT (Vision Transformer) – Image Pre-processing
● Normalize, resize to (300,300,3), and center crop to (224,224,3)

ViT (Vision Transformer) – Patch Encoder
● In this project, we use Conv2D as the Patch Encoder with kernel_size = (16,16), stride = (16,16), and output_channel = 768
● (224,224,3) -> (14,14, 16*16*3) -> (196, 768)

ViT (Vision Transformer) – Class Token
● Now we have 196 tokens, and each token has 768 features
● In order to record global information, we need to concatenate one learnable class token with the 196 tokens
● (196,768) -> (197,768)

ViT (Vision Transformer) – Position Embedding
● Add the learnable position information to the patch embedding
● (197,768) + position_embedding(197,768) -> (197,768)

ViT (Vision Transformer) – Layer Normalization
(T = # of tokens, C = embedded dimension)
● Normalize each token over its C features
● You need to normalize with the given formula (see the sketch below)
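The exact formula is given on the slide; as a reference only, here is a minimal per-token layer normalization sketch in C++, assuming the standard LayerNorm definition with learnable per-channel gamma/beta and a small epsilon (the function name and signature are hypothetical, not the required interface of layernorm.cpp):

#include <cmath>
#include <cstddef>

// Normalize each token (a row of C features) independently:
// y = (x - mean) / sqrt(var + eps) * gamma + beta
void layernorm_sketch(const float* in, float* out,
                      const float* gamma, const float* beta,
                      std::size_t T, std::size_t C, float eps = 1e-5f) {
    for (std::size_t t = 0; t < T; ++t) {
        const float* x = in + t * C;
        float* y = out + t * C;
        float mean = 0.0f;
        for (std::size_t c = 0; c < C; ++c) mean += x[c];
        mean /= C;
        float var = 0.0f;
        for (std::size_t c = 0; c < C; ++c) {
            float d = x[c] - mean;
            var += d * d;
        }
        var /= C;  // population variance, as in the standard LayerNorm definition
        float inv_std = 1.0f / std::sqrt(var + eps);
        for (std::size_t c = 0; c < C; ++c)
            y[c] = (x[c] - mean) * inv_std * gamma[c] + beta[c];
    }
}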
ViT (Vision Transformer) – MultiHead Self Attention (1)
● Wk, Wq, Wv ∈ R^(C×C)
● bq, bk, bv ∈ R^C
● Wo ∈ R^(C×C)
● bo ∈ R^C
● Dataflow: input X -> input linear projection (Wk, Wq, Wv and bq, bk, bv) -> split into heads -> Attention -> merge heads -> output linear projection (Wo and bo) -> output Y
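As a reference, a hypothetical container for these MHSA parameters, mirroring the shapes above (the real weight layout and loading code are provided with the project, so this is only a sketch):

#include <vector>

// Hypothetical parameter container for one MHSA layer (row-major storage).
struct MhsaWeightsSketch {
    std::vector<float> Wq, Wk, Wv;  // input projection weights, each C x C
    std::vector<float> bq, bk, bv;  // input projection biases, each of length C
    std::vector<float> Wo;          // output projection weight, C x C
    std::vector<float> bo;          // output projection bias, length C
};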
ViT (Vision Transformer) – MultiHead Self Attention (2)
(T = # of tokens, C = embedded dimension, H = hidden dimension, NH = # of heads, C = H * NH)
● Input linear projection: Q = X Wq^T + bq, K = X Wk^T + bk, V = X Wv^T + bv
● Get Q, K, V ∈ R^(T×(NH*H)) after the input linear projection
● Split Q, K, V into Q1, Q2, Q3, ..., QNH; K1, K2, K3, ..., KNH; V1, V2, V3, ..., VNH ∈ R^(T×H)
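These projections are plain matrix multiplications with a bias; a minimal sketch, assuming row-major [T*C_in] inputs and [C_out*C_in] weights so that Y = X W^T + b (the function name and argument order are hypothetical):

#include <cstddef>

// Y[t][o] = b[o] + sum_i X[t][i] * W[o][i]
// X: T x C_in, W: C_out x C_in, Y: T x C_out (all row-major)
void linear_sketch(const float* X, const float* W, const float* b, float* Y,
                   std::size_t T, std::size_t C_in, std::size_t C_out) {
    for (std::size_t t = 0; t < T; ++t) {
        for (std::size_t o = 0; o < C_out; ++o) {
            float acc = b[o];
            for (std::size_t i = 0; i < C_in; ++i)
                acc += X[t * C_in + i] * W[o * C_in + i];
            Y[t * C_out + o] = acc;
        }
    }
}

Because the simulated CPU is a TimingSimpleCPU that stalls on every memory request, the loop order and data reuse of kernels like this largely determine the measured latency.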
ViT (Vision Transformer) – MultiHead Self Attention (2)
(T = # of tokens, H = hidden dimension)
● For each head i, compute Si = Qi Ki^T / sqrt(H) ∈ R^(T×T)
● Pi = Softmax(Si) ∈ R^(T×T); Softmax is a row-wise function
● Oi = Pi Vi ∈ R^(T×H)
● Dataflow per head: Qi, Ki -> matrix multiplication and scale -> Softmax -> matrix multiplication with Vi -> Oi
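A minimal sketch of these three steps for a single head, assuming Qi, Ki, Vi are row-major [T*H] arrays (names are hypothetical; the row max is subtracted before exponentiation purely for numerical stability, which does not change the softmax result):

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// For one head: S = Q K^T / sqrt(H), P = row-wise softmax(S), O = P V
void attention_head_sketch(const float* Q, const float* K, const float* V,
                           float* O, std::size_t T, std::size_t H) {
    std::vector<float> S(T * T);
    const float scale = 1.0f / std::sqrt(static_cast<float>(H));
    // S = Q K^T / sqrt(H)
    for (std::size_t i = 0; i < T; ++i)
        for (std::size_t j = 0; j < T; ++j) {
            float acc = 0.0f;
            for (std::size_t h = 0; h < H; ++h)
                acc += Q[i * H + h] * K[j * H + h];
            S[i * T + j] = acc * scale;
        }
    // P = row-wise softmax(S), computed in place
    for (std::size_t i = 0; i < T; ++i) {
        float mx = S[i * T];
        for (std::size_t j = 1; j < T; ++j) mx = std::max(mx, S[i * T + j]);
        float sum = 0.0f;
        for (std::size_t j = 0; j < T; ++j) {
            S[i * T + j] = std::exp(S[i * T + j] - mx);
            sum += S[i * T + j];
        }
        for (std::size_t j = 0; j < T; ++j) S[i * T + j] /= sum;
    }
    // O = P V
    for (std::size_t i = 0; i < T; ++i)
        for (std::size_t h = 0; h < H; ++h) {
            float acc = 0.0f;
            for (std::size_t j = 0; j < T; ++j)
                acc += S[i * T + j] * V[j * H + h];
            O[i * H + h] = acc;
        }
}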
ViT (Vision Transformer) – MultiHead Self Attention (3)
(T = # of tokens, C = embedded dimension, H = hidden dimension, NH = # of heads)
● Merge heads: Oi ∈ R^(T×H), O = [O1, O2, ..., ONH] ∈ R^(T×C)
● Output linear projection: output = O Wo^T + bo
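A minimal sketch of merging the per-head outputs back into one [T*C] buffer, assuming head i occupies columns i*H .. i*H+H-1 of O (the actual layout used by the provided code may differ):

#include <cstddef>

// Copy one head's output Oi (T x H) into columns [head*H, head*H + H) of O (T x C), C = NH * H.
void merge_head_sketch(const float* Oi, float* O, std::size_t head,
                       std::size_t T, std::size_t H, std::size_t NH) {
    const std::size_t C = NH * H;
    for (std::size_t t = 0; t < T; ++t)
        for (std::size_t h = 0; h < H; ++h)
            O[t * C + head * H + h] = Oi[t * H + h];
}
// The merged O is then fed to the output projection: output = O Wo^T + bo.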
ViT (Vision Transformer) – Feed Forward Network
(T = # of tokens, C = embedded dimension, OC = hidden dimension)
● Input linear projection: input (T, C) -> hidden (T, OC)
● GeLU activation on the hidden features
● Output linear projection: hidden (T, OC) -> output (T, C)
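A minimal sketch of the feed-forward network, assuming a row-major [T*C] input, a [T*OC] hidden buffer, W1 of shape OC x C, W2 of shape C x OC, and the erf form of GeLU (check the provided testbench for the exact GeLU variant; all names here are hypothetical):

#include <cmath>
#include <cstddef>
#include <vector>

// FFN: hidden = GeLU(X W1^T + b1), output = hidden W2^T + b2
void feedforward_sketch(const float* X, const float* W1, const float* b1,
                        const float* W2, const float* b2, float* out,
                        std::size_t T, std::size_t C, std::size_t OC) {
    std::vector<float> hidden(T * OC);
    for (std::size_t t = 0; t < T; ++t)
        for (std::size_t o = 0; o < OC; ++o) {
            float acc = b1[o];
            for (std::size_t c = 0; c < C; ++c)
                acc += X[t * C + c] * W1[o * C + c];
            // GeLU (erf form); see the GeLU slide below for the two common variants
            hidden[t * OC + o] = 0.5f * acc * (1.0f + std::erf(acc / std::sqrt(2.0f)));
        }
    for (std::size_t t = 0; t < T; ++t)
        for (std::size_t c = 0; c < C; ++c) {
            float acc = b2[c];
            for (std::size_t o = 0; o < OC; ++o)
                acc += hidden[t * OC + o] * W2[c * OC + o];
            out[t * C + c] = acc;
        }
}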
ViT (Vision Transformer) – GeLU
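The GeLU definition is shown on the slide; as a reference, the two common variants are sketched below. Which one the gelu_tb testbench expects is defined by the provided code, so treat this only as a sketch:

#include <cmath>

// Exact GeLU: 0.5 * x * (1 + erf(x / sqrt(2)))
float gelu_exact(float x) {
    return 0.5f * x * (1.0f + std::erf(x / std::sqrt(2.0f)));
}

// Tanh approximation: 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
float gelu_tanh(float x) {
    const float k = 0.7978845608f;  // sqrt(2 / pi)
    return 0.5f * x * (1.0f + std::tanh(k * (x + 0.044715f * x * x * x)));
}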
ViT (Vision Transformer) – Classifier
● Contains a Linear layer to transform 768 features to 200 classes
○ (197, 768) -> (197, 200)
● Only the first token (class token) is used for the prediction
○ (197, 200) -> (1, 200)

ViT (Vision Transformer) – Work Flow
● Overall flow: Pre-processing -> Embedder -> Transformer x12 -> Classifier -> Argmax -> prediction (Black footed Albatross)
● The flow also includes Load_weight, m5_dump_init, and m5_dump_stat
● Each transformer block: layernorm -> MHSA -> residual (+) -> layernorm -> FFN -> residual (+)
● MHSA is built from matmul and attention; FFN is built from matmul, gelu, matmul
● Testbenches:
$ make gelu_tb
$ make matmul_tb
$ make layernorm_tb
$ make MHSA_tb
$ make feedforward_tb
$ make transformer_tb
$ run_all.sh

ViT (Vision Transformer) – Shape of array
● layernorm input/output: [T*C], laid out as token 1, token 2, ..., token T (C values per token)
● MHSA input/output/o: [T*C]
● MHSA qkv: [T*3*C], laid out as q token 1, k token 1, v token 1, ..., q token T, k token T, v token T (C values per segment)
● feedforward input/output: [T*C]
● feedforward gelu: [T*OC], laid out as token 1, token 2, ..., token T (OC values per token)
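As a reference for the layouts above, a few index helpers, assuming the row-major storage exactly as described (helper names are hypothetical; adjust if the provided code uses a different layout):

#include <cstddef>

// input/output [T*C]: feature c of token t
inline std::size_t idx_tc(std::size_t t, std::size_t c, std::size_t C) { return t * C + c; }

// MHSA qkv [T*3*C]: for each token, q (C values), then k (C values), then v (C values)
inline std::size_t idx_q(std::size_t t, std::size_t c, std::size_t C) { return t * 3 * C + c; }
inline std::size_t idx_k(std::size_t t, std::size_t c, std::size_t C) { return t * 3 * C + C + c; }
inline std::size_t idx_v(std::size_t t, std::size_t c, std::size_t C) { return t * 3 * C + 2 * C + c; }

// feedforward gelu [T*OC]: hidden feature o of token t
inline std::size_t idx_hidden(std::size_t t, std::size_t o, std::size_t OC) { return t * OC + o; }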
Common problem
        ● Segmentation fault
        ○ ensure that you are not accessing a nonexistent memory address
○ Enter the command: $ ulimit -s unlimited

All you have to do is
● Download the TA's Gem5 image
○ docker pull yenzu/ca_final_part2:2024
● Understand the algorithm and write the C++ code in the ./layer folder
○ make clean
○ make <layer>_tb
○ ./<layer>_tb

All you have to do is
● Ensure the ViT successfully classifies the bird
○ python3 embedder.py --image_path images/Black_Footed_Albatross_0001_796111.jpg --embedder_path weights/embedder.pth --output_path embedded_image.bin
○ g++ -static main.cpp layer/*.cpp -o process
○ ./process
○ python3 run_model.py --input_path result.bin --output_path torch_pred.bin --model_path weights/model.pth
○ python3 classifier.py --prediction_path torch_pred.bin --classifier_path weights/classifier.pth
○ After running the above commands, you will get the top-5 prediction.
● Evaluate the performance of part of the ViT, namely layernorm + MHSA + residual
○ The simulation needs about 3.5 hours to finish
○ Check stat.txt

Grading Policy
        ● (50%) Verification
        ○ (10%) matmul_tb
        ○ (10%) layernorm_tb
        ○ (10%) gelu_tb
        ○ (10%) MHSA_tb
        ○ (10%) transformer_tb
        ● (50%) Performance
○ max(sigmoid((27.74 - student latency) / student latency) * 70, 50)
● You will get 0 performance points if your design is not verified.

Submission
        ● Please submit code on E3 before 23:59 on June 20, 2024.
        ● Late submission is not allowed.
● Plagiarism is forbidden, otherwise you will get 0 points!!!
● Format
○ Code: please put your code in a folder named FP2_team<ID>_code and compress it into a zip file.

FP2_team<ID>_code folder
        ● You should attach the following documents
        ○ matmul.cpp
        ○ layernorm.cpp
        ○ gelu.cpp
        ○ attention.cpp
        ○ residual.cpp
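residual.cpp is the simplest of these layers: the residual connection is an element-wise add of the block input and the block output over a [T*C] buffer. A minimal sketch with a hypothetical signature:

#include <cstddef>

// out[i] = a[i] + b[i] over n = T*C elements
void residual_sketch(const float* a, const float* b, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}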
