cmd/5g: calling a method via interface is much slower in ARM #9636

Closed
siritinga opened this issue Jan 19, 2015 · 6 comments

Comments

@siritinga

I wrote a very simple program to benchmark a direct method call vs. a method call via an interface: http://play.golang.org/p/0yWkDh_fut

If I run it on amd64 I get very similar performance (using Go 1.4.1):
$ go test -bench=.
BenchmarkIntMethod 1000000000 2.22 ns/op
BenchmarkInterface 1000000000 2.60 ns/op

However, if I run it on ARM (Raspberry Pi), the interface benchmark is much slower:
$ go test -bench=.
BenchmarkIntMethod 100000000 16.1 ns/op
BenchmarkInterface 20000000 73.6 ns/op

Is this an expected behaviour?
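
The playground program is not reproduced in this thread. As a rough illustration of the kind of comparison being made, here is a minimal sketch. The benchmark names match the output above, but the `Int` type, the `Incer` interface, and the `newIncer` helper are hypothetical, and the `//go:noinline` directives assume a toolchain newer than the Go 1.4.1 used here (they keep a modern compiler from inlining the direct call or resolving the interface call statically).

```go
package main

import (
	"fmt"
	"testing"
)

// Int is a hypothetical receiver type; the original program may differ.
type Int struct{ n int }

// Inc is the method under test. noinline forces a real call instruction.
//
//go:noinline
func (i *Int) Inc() { i.n++ }

// Incer is a hypothetical one-method interface satisfied by *Int.
type Incer interface{ Inc() }

// newIncer hides the concrete type from the caller, so the call through
// the returned value has to go through the interface's method table.
//
//go:noinline
func newIncer() Incer { return &Int{} }

// BenchmarkIntMethod times a direct call on the concrete type.
func BenchmarkIntMethod(b *testing.B) {
	v := &Int{}
	for i := 0; i < b.N; i++ {
		v.Inc()
	}
}

// BenchmarkInterface times the same method called via the interface.
func BenchmarkInterface(b *testing.B) {
	v := newIncer()
	for i := 0; i < b.N; i++ {
		v.Inc()
	}
}

func main() {
	// testing.Benchmark lets the comparison run without `go test`.
	fmt.Println("direct:   ", testing.Benchmark(BenchmarkIntMethod))
	fmt.Println("interface:", testing.Benchmark(BenchmarkInterface))
}
```

The absolute numbers depend heavily on the CPU and compiler version, so only the ratio between the two lines is worth comparing across machines.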

@bryanturley

The Raspberry Pi uses a very old ARM processor. It may just be a difference between the architectures.
I am surprised the direct call and the interface call are that close on your amd64 machine.
Try benchmarking a func pointer call vs direct func call and see if you get similar results.

@siritinga
Author

Maybe it is simply that the ARM lacks some Intel feature that makes the call so fast, or maybe there is some optimization missing. Just in case I opened the issue.

Regarding the func pointer, I'm not sure if you mean this: http://play.golang.org/p/tLq4vYMgql. In this case, for amd64 it is 0.4 vs. 5 ns/op. For ARM, it is 12 vs. 63 ns/op, so the ratio is the same.
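
For readers without access to the playground link, a func pointer comparison of this sort might look roughly like the sketch below. All names here (`counter`, `inc`, `fp`, and both benchmark functions) are hypothetical and are not taken from the linked program. `//go:noinline` is assumed to be available, which it was not in Go 1.4.1.

```go
package main

import (
	"fmt"
	"testing"
)

// counter gives the called function an observable side effect.
var counter int

// inc is the call target. noinline keeps the direct call a real call.
//
//go:noinline
func inc() { counter++ }

// fp is a package-level variable, so the compiler cannot assume which
// function it points to and has to emit an indirect call.
var fp = inc

// BenchmarkDirectFunc times a call to a statically known function.
func BenchmarkDirectFunc(b *testing.B) {
	for i := 0; i < b.N; i++ {
		inc()
	}
}

// BenchmarkFuncPointer times the same function called through a func value.
func BenchmarkFuncPointer(b *testing.B) {
	f := fp
	for i := 0; i < b.N; i++ {
		f()
	}
}

func main() {
	fmt.Println("direct:      ", testing.Benchmark(BenchmarkDirectFunc))
	fmt.Println("func pointer:", testing.Benchmark(BenchmarkFuncPointer))
}
```

If the direct call is allowed to inline, the direct figure can drop well below one nanosecond, which would widen the gap without saying anything about the cost of the indirect call itself.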

@minux
Member

minux commented Jan 19, 2015

I've checked disassembly, and it's as expected.

Indirect calls are just expensive on ARM11.

Please try benchmarking a C version with indirect function calls and see if
it's consistent with the result from Go. 5g does a bad job of code generation,
we all know that, but I can't see how it's that bad if indirect calls are
not significantly more expensive than direct calls.

@davecheney
Contributor

That sounds reasonable; an indirect jump through a register is harder to
predict.

I look forward to spending some quality time with 5g after the c2go
transition.

@mikioh changed the title from "Calling a method via interface is much slower in ARM" to "cmd/5g: calling a method via interface is much slower in ARM" on Jan 19, 2015
@larryr

larryr commented Feb 8, 2015

For what it's worth, the NVIDIA Jetson TK1 gave the following (using a recent tip build):
$ go test -bench=.
testing: warning: no tests to run
PASS
BenchmarkIntMethod 500000000 3.49 ns/op
BenchmarkInterface 200000000 6.93 ns/op
ok hello 4.203s
$ go version
go version devel +dfc4997 Sun Jan 25 22:46:49 2015 +0000 linux/arm

@rsc
Contributor

rsc commented Apr 10, 2015

The assembly is fine:

g% cat /tmp/x.go
package p

type I interface {
    M()
}

func f()

func g(i I) {
    f()
    i.M()
}
g% GOARCH=arm GOOS=linux go tool 5g -S /tmp/x.go
"".g t=1 size=52 value=0 args=0x8 locals=0x4
    0x0000 00000 (/tmp/x.go:9)  TEXT    "".g(SB), $4-8
    0x0000 00000 (/tmp/x.go:9)  MOVW    8(g), R1
    0x0004 00004 (/tmp/x.go:9)  CMP R1, R13
    0x0008 00008 (/tmp/x.go:9)  MOVW.LS R14, R3
    0x000c 00012 (/tmp/x.go:9)  CALL.LS runtime.morestack_noctxt(SB)
    0x0010 00016 (/tmp/x.go:9)  BLS 0
    0x0014 00020 (/tmp/x.go:9)  MOVW.W  R14, -8(R13)
    0x0018 00024 (/tmp/x.go:9)  FUNCDATA    $0, gclocals·50fa090b935bc5e8dc32898bb7e4b528(SB)
    0x0018 00024 (/tmp/x.go:9)  FUNCDATA    $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
    0x0018 00024 (/tmp/x.go:10) PCDATA  $0, $0
    0x0018 00024 (/tmp/x.go:10) CALL    "".f(SB)
    0x001c 00028 (/tmp/x.go:11) MOVW    "".i+4(FP), R0
    0x0020 00032 (/tmp/x.go:11) MOVW    R0, 4(R13)
    0x0024 00036 (/tmp/x.go:11) MOVW    "".i(FP), R0
    0x0028 00040 (/tmp/x.go:11) MOVW    20(R0), R0
    0x002c 00044 (/tmp/x.go:11) PCDATA  $0, $0
    0x002c 00044 (/tmp/x.go:11) CALL    (R0)
    0x0030 00048 (/tmp/x.go:12) MOVW.P  8(R13), R15

The problem appears to be that the Raspberry Pi is very slow at indirect calls.
There's not a lot we can do about that.
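
To make the last few instructions of the listing easier to follow, here is a rough Go model of what the interface call does: take the data word as the receiver, fetch a method pointer from the table the interface value points to, and call through it. This is only a sketch. The `itab` and `iface` types below are simplified, hypothetical stand-ins for the runtime's structures, and reading the `20(R0)` load as "first method slot of the table" is an assumption about the layout rather than something stated in the thread.

```go
package main

import "fmt"

// Int is a hypothetical concrete type with one method.
type Int struct{ n int }

// M is the method that gets called through the model interface.
func (i *Int) M() { i.n++ }

// itab is a simplified stand-in for a per-type method table.
// The real table holds raw code addresses, not Go func values.
type itab struct {
	fun []func(recv *Int)
}

// iface models a two-word interface value: table pointer plus data pointer.
type iface struct {
	tab  *itab
	data *Int
}

// callM mirrors the instruction sequence at /tmp/x.go:11 in the listing.
func callM(i iface) {
	recv := i.data     // MOVW "".i+4(FP), R0 ; MOVW R0, 4(R13)
	fn := i.tab.fun[0] // MOVW "".i(FP), R0   ; MOVW 20(R0), R0
	fn(recv)           // CALL (R0): indirect call through a register
}

func main() {
	v := &Int{}
	// (*Int).M is a method expression with type func(*Int).
	i := iface{tab: &itab{fun: []func(*Int){(*Int).M}}, data: v}
	callM(i)
	fmt.Println(v.n) // prints 1
}
```

The only difference from the direct `CALL "".f(SB)` earlier in the listing is the two extra loads and the fact that the target address comes from a register, which is the part the Raspberry Pi's core handles slowly.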

@rsc closed this as completed on Apr 10, 2015
@golang locked and limited conversation to collaborators on Jun 25, 2016