How to force the compiler to pass a "vector of 4" wrapper class as a single XMM register?

I'm trying to optimize a small "vector of 4 floats" wrapper class, and of course I want to make it convenient as well. For example:

typedef float v4f __attribute__ ((vector_size (16)));

struct V4 {

    union {
        v4f packed;
#if 1
        struct { float r, g, b, a; };
#endif
#if 1
        float data[4];
#endif
    };

    V4() = default;
    V4(v4f v) : packed(v) {}
};

V4 AddV4(V4 a, V4 b) { 
    return a.packed + b.packed; 
}
V4 MulV4(V4 a, V4 b) { 
    return a.packed * b.packed; 
}

static_assert(sizeof(V4) == 16);

I know that reading a union member other than the one last written is undefined behavior in standard C++, but in practice it works fine with gcc and clang ;-)

The problem is the following: I tested this on Godbolt (see https://godbolt.org/z/fXbtre) with both gcc and clang, using these command-line arguments:

-O3  -fomit-frame-pointer -fno-rtti -fno-exceptions -mavx -ffast-math

If I disable both the struct and the array in the union (i.e. set both to #if 0), I get really compact AddV4 and MulV4 functions, e.g.:

AddV4(V4, V4):
        vaddps  xmm0, xmm0, xmm1
        ret

But if I enable either of the two, I get:

AddV4(V4, V4):
        vmovq   QWORD PTR [rsp-32], xmm1
        vmovq   QWORD PTR [rsp-40], xmm0
        vmovaps xmm5, XMMWORD PTR [rsp-40]
        vmovq   QWORD PTR [rsp-24], xmm2
        vmovq   QWORD PTR [rsp-16], xmm3
        vaddps  xmm4, xmm5, XMMWORD PTR [rsp-24]
        vmovaps XMMWORD PTR [rsp-40], xmm4
        mov     rax, QWORD PTR [rsp-32]
        vmovq   xmm0, QWORD PTR [rsp-40]
        vmovq   xmm1, rax
        mov     QWORD PTR [rsp-24], rax
        ret

Can someone explain why? Is there a gcc/clang compiler flag I could use to fix this? Or is the only option really to keep just the packed member? (In that case I would need to write accessor methods x(), y(), z(), w(), which would be quite a big change in our codebase, so I would prefer another option first.)
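For reference, the accessor-based fallback I have in mind would look roughly like the sketch below. It is only an illustration, not tested in our codebase: the accessor names x(), y(), z(), w() are the ones mentioned above, while the set_x() ... set_w() setters are hypothetical names I made up for this example. It relies on the gcc/clang extension that allows subscripting a vector_size type, and it uses setters rather than returning float& because a reference to a vector element is not something I would count on being accepted by both compilers.

```cpp
typedef float v4f __attribute__ ((vector_size (16)));

// Sketch: V4 holds only the packed vector (no union), so its layout
// matches the "#if 0" variant from the question.
struct V4 {
    v4f packed;

    V4() = default;
    V4(v4f v) : packed(v) {}

    // Read accessors, using the gcc/clang vector subscript extension.
    float x() const { return packed[0]; }
    float y() const { return packed[1]; }
    float z() const { return packed[2]; }
    float w() const { return packed[3]; }

    // Hypothetical setters (names are assumptions, not existing API).
    void set_x(float v) { packed[0] = v; }
    void set_y(float v) { packed[1] = v; }
    void set_z(float v) { packed[2] = v; }
    void set_w(float v) { packed[3] = v; }
};

// Element-wise add and multiply, unchanged from the question.
V4 AddV4(V4 a, V4 b) {
    return a.packed + b.packed;
}
V4 MulV4(V4 a, V4 b) {
    return a.packed * b.packed;
}

static_assert(sizeof(V4) == 16, "V4 must stay 16 bytes");
```

Call sites would then change from c.r to c.x() and from c.r = 1.0f to c.set_x(1.0f), which is exactly the kind of codebase-wide edit I would like to avoid.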
