
Author Topic: Assembly Programmers - Help Axe Optimize!  (Read 157953 times)


Offline Happybobjr

  • James Oldiges
  • LV11 Super Veteran (Next: 3000)
  • ***********
  • Posts: 2325
  • Rating: +128/-20
  • Howdy :)
    • View Profile
Re: Assembly Programmers - Help Axe Optimize!
« Reply #240 on: August 30, 2011, 06:23:00 am »
What does that code do, though?
School: East Central High School
 
Axe: 1.0.0
TI-84 +SE  ||| OS: 2.53 MP (patched) ||| Version: "M"
TI-Nspire    |||  Lent out, and never returned
____________________________________________________________

Offline Runer112

  • Project Author
  • LV11 Super Veteran (Next: 3000)
  • ***********
  • Posts: 2289
  • Rating: +639/-31
    • View Profile
Re: Assembly Programmers - Help Axe Optimize!
« Reply #241 on: September 18, 2011, 03:15:23 am »
At this rate, I'll have optimized just about every Axe routine eventually! ;)




p_ToHex: 31 cycles faster.

Code: (Old code: 25 bytes, 670 cycles) [Select]
p_ToHex:
.db __ToHexEnd-$-1
ld b,4
ld de,vx_SptBuff
push de
__ToHexLoop:
ld a,$1F
__ToHexShift:
add hl,hl
rla
jr nc,__ToHexShift
daa
add a,$A0
adc a,$40
ld (de),a
inc de
djnz __ToHexLoop
xor a
ld (de),a
pop hl
ret
__ToHexEnd:
       
   
Code: (New code: 25 bytes, 639 cycles) [Select]
p_ToHex:
.db __ToHexEnd-$-1
ld bc,4<<8+$1F
ld de,vx_SptBuff
__ToHexLoop:
ld a,c
__ToHexShift:
add hl,hl
rla
jr nc,__ToHexShift
daa
add a,$A0
adc a,$40
ld (de),a
inc e
djnz __ToHexLoop
ex de,hl
ld (hl),b
ld l,vx_SptBuff&$FF
ret
__ToHexEnd:
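
For anyone who wants to check the daa / add a,$A0 / adc a,$40 nibble-to-ASCII trick off-calc, here is a rough Python model of the routine. It is only a sketch: the function names (daa_after_add, to_hex) are made up for illustration, and the DAA model only covers the case where the N flag is clear, which is the case after rla.

```python
def daa_after_add(a, carry, half=False):
    """Model Z80 DAA when the N flag is clear (after an add or a rotate)."""
    adjust = 0
    if half or (a & 0x0F) > 9:
        adjust |= 0x06
    if carry or a > 0x99:
        adjust |= 0x60
        carry = True
    return (a + adjust) & 0xFF, carry


def to_hex(hl):
    """Mirror p_ToHex: turn a 16-bit value into four ASCII hex digits."""
    digits = []
    for _ in range(4):                      # ld b,4 ... djnz
        a = 0x1F                            # sentinel: its bit 4 falls out on the 4th shift
        while True:
            carry_in = (hl >> 15) & 1       # add hl,hl
            hl = (hl << 1) & 0xFFFF
            carry_out = (a >> 7) & 1        # rla
            a = ((a << 1) | carry_in) & 0xFF
            if carry_out:                   # jr nc,__ToHexShift
                break
        # At this point A = $F0 + nibble and the carry flag is set.
        a, _ = daa_after_add(a, carry=True)
        total = a + 0xA0                    # add a,$A0
        a, carry = total & 0xFF, total > 0xFF
        a = (a + 0x40 + carry) & 0xFF       # adc a,$40
        digits.append(chr(a))
    return "".join(digits)
```

Digits 0-9 come out of the daa as $50-$59 and letters as $60-$65, so the add a,$A0 only carries for letters, and that carry is what bumps the final result from the '0' range into the 'A' range.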
       




p_ShiftLeft: 1 byte smaller, 67 cycles faster. You could save an additional 384 cycles by giving up the minor size savings and loading 12<<8+4 into de at the start of the routine and then replacing the immediate data operands in the loop with d and e.

Code: (Old code: 17 bytes, 27542 cycles) [Select]
p_ShiftLeft:
.db __ShiftLeftEnd-1-$
ld hl,plotSScreen+767
ld c,64
__ShiftLeftLoop:
ld b,12
or a
__ShiftLeftShift:
rl (hl)
dec hl
djnz __ShiftLeftShift
dec c
jr nz,__ShiftLeftLoop
ret
__ShiftLeftEnd:
       
   
Code: (New code: 16 bytes, 27475 cycles) [Select]
p_ShiftLeft:
.db __ShiftLeftEnd-1-$
ld hl,plotSScreen+767
xor a
__ShiftLeftLoop:
ld b,12
__ShiftLeftShift:
rl (hl)
dec hl
djnz __ShiftLeftShift
add a,4
jr nz,__ShiftLeftLoop
ret
__ShiftLeftEnd:
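
The cycle totals above can be reproduced with a bit of arithmetic. This Python sketch assumes the standard Z80 T-state timings and ignores any calculator-specific wait states; all names in it are illustrative only.

```python
# T-state costs from the standard Z80 timing tables.
LD_HL_NN, LD_R_N, LD_R_R = 10, 7, 4
XOR_A, OR_A, DEC_R = 4, 4, 4
ADD_A_N, ADD_A_R = 7, 4
RL_HL, DEC_HL, RET = 15, 6, 10
DJNZ_TAKEN, DJNZ_DONE = 13, 8
JR_TAKEN, JR_DONE = 12, 7

ROWS, BYTES_PER_ROW = 64, 12


def row_shift():
    """Cost of ld b,12 plus the 12-byte rl (hl) / dec hl / djnz loop."""
    body = BYTES_PER_ROW * (RL_HL + DEC_HL)
    djnz = (BYTES_PER_ROW - 1) * DJNZ_TAKEN + DJNZ_DONE
    return LD_R_N + body + djnz


def routine_cycles(setup, per_row_overhead):
    per_row = row_shift() + per_row_overhead + JR_TAKEN
    # The final jr nz falls through, which is cheaper than a taken jump.
    return LD_HL_NN + setup + ROWS * per_row - (JR_TAKEN - JR_DONE) + RET


old = routine_cycles(setup=LD_R_N, per_row_overhead=OR_A + DEC_R)  # ld c,64 / or a / dec c
new = routine_cycles(setup=XOR_A, per_row_overhead=ADD_A_N)        # xor a / add a,4

# Gross loop savings from using ld b,d and add a,e instead of immediates
# (this does not subtract the one-time cost of the ld de at the start).
extra = ROWS * ((LD_R_N - LD_R_R) + (ADD_A_N - ADD_A_R))


def rows_until_wrap(step=4):
    """Count add a,4 iterations until A wraps to 0; carry must stay clear before that."""
    a, rows = 0, 0
    while True:
        total = a + step
        a, carry = total & 0xFF, total > 0xFF
        rows += 1
        if a == 0:
            return rows
        assert not carry  # each row must start with carry clear for rl (hl)
```

The last function shows why add a,4 can replace both the or a and the dec c: the carry stays clear on every add except the final one, and A reaches zero after exactly 64 rows.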
       




p_ShiftRight: 1 byte smaller, 67 cycles faster. Same deal as p_ShiftLeft.

Code: (Old code: 17 bytes, 27542 cycles) [Select]
p_ShiftRight:
.db __ShiftRightEnd-1-$
ld hl,plotSScreen
ld c,64
__ShiftRightLoop:
ld b,12
or a
__ShiftRightShift:
rr (hl)
inc hl
djnz __ShiftRightShift
dec c
jr nz,__ShiftRightLoop
ret
__ShiftRightEnd:
       
   
Code: (New code: 16 bytes, 27475 cycles) [Select]
p_ShiftRight:
.db __ShiftRightEnd-1-$
ld hl,plotSScreen
xor a
__ShiftRightLoop:
ld b,12
__ShiftRightShift:
rr (hl)
inc hl
djnz __ShiftRightShift
add a,4
jr nz,__ShiftRightLoop
ret
__ShiftRightEnd:
       




p_FreqOut: 1 byte smaller. Takes advantage of an absolute jump. This is a strange routine to optimize, because optimizing it makes it run about 15% faster, which would result in slightly higher-pitched and shorter notes. Although this command is rarely used, that change in behavior might still make the optimization not worth it. Whether or not you include the optimization, it might be a good idea to change this routine to use p_Safety.

Code: (Old code: 23 bytes) [Select]
p_FreqOut:
.db __FreqOutEnd-1-$
xor a
__FreqOutLoop1:
push bc
ld e,a
__FreqOutLoop2:
ld a,h
or l
jr z,__FreqOutDone
dec hl
dec bc
ld a,b
or c
jr nz,__FreqOutLoop2
ld a,e
xor %00000011
scf
__FreqOutDone:
pop bc
out ($00),a
ret nc
jr __FreqOutLoop1
__FreqOutEnd:
       
   
Code: (New code: 22 bytes) [Select]
p_FreqOut:
.db __FreqOutEnd-1-$
xor a
__FreqOutLoop1:
push bc
ld e,a
__FreqOutLoop2:
ld a,h
or l
jr z,__FreqOutDone
cpd
jp pe,__FreqOutLoop2
ld a,e
xor %00000011
scf
__FreqOutDone:
pop bc
out ($00),a
ret nc
jr __FreqOutLoop1
__FreqOutEnd:
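
As a sanity check on the "about 15% faster" figure, here is a small Python sketch. It assumes standard Z80 timings, and the helper names are hypothetical. It models only the parts of cpd that the loop relies on.

```python
def dec_pair_old(hl, bc):
    """dec hl / dec bc / ld a,b / or c: returns (hl, bc, keep_looping)."""
    hl = (hl - 1) & 0xFFFF
    bc = (bc - 1) & 0xFFFF
    return hl, bc, bc != 0


def cpd(hl, bc):
    """cpd as used here: HL and BC both decrement and P/V is set while BC != 0.
    The compare of A with (HL) only touches flags that the loop ignores."""
    hl = (hl - 1) & 0xFFFF
    bc = (bc - 1) & 0xFFFF
    parity_even = bc != 0
    return hl, bc, parity_even


# Inner loop cost per iteration, in T-states.
# Old: ld a,h / or l / jr z (not taken) / dec hl / dec bc / ld a,b / or c / jr nz (taken)
OLD_INNER = 4 + 4 + 7 + 6 + 6 + 4 + 4 + 12
# New: ld a,h / or l / jr z (not taken) / cpd / jp pe
NEW_INNER = 4 + 4 + 7 + 16 + 10

speedup = OLD_INNER / NEW_INNER
```

With these timings the inner loop drops from 47 to 41 cycles per iteration, which is where the pitch change comes from.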
       




p_IntSetup: 4 bytes smaller. I thought this was some pretty impressive work. ;D And regarding interrupts, I still think the port 6 saving and restoring shenanigans aren't necessary for programs. The only reason port 6 would need to be restored to the value it held when interrupts were enabled is if the user is using a shell application in conjunction with their Axe program. In that case, either the designer of the shell application interface system could provide modified interrupt routines in an Axiom, or the user is probably intelligent enough to be able to provide their own interrupt routines. (Actually, it wouldn't even need to be their own; they could just copy the one for applications from the Commands.inc file.)

Code: (Old code: 42 bytes, a lot of cycles) [Select]
p_IntSetup:
.db __IntEnd-p_IntSetup-1
di
ld de,$8B01
ld a,d
ld i,a
ld a,l
ld hl,$8B00
ld b,e
ld c,l
ld (hl),$8A
ldir

and %00000110
out (4),a
ld a,%00001000
out (3),a
ld a,(hl)
out (3),a

ld d,a
ld e,a
ld c,__IntDataEnd-__IntData
ld hl,$0000
ldir

in a,(6)
ld ($8A8A+__IntDataSMC-__IntData+1),a
__IntEnd:
.db rp_Ans,9
       
   
Code: (New code: 38 bytes, more cycles but who cares?) [Select]
p_IntSetup:
.db __IntEnd-p_IntSetup-1
di
ld a,l
ld hl,$8C06
ld de,$8C05
ld bc,$8C05-$8A8A

and l
out (4),a
ld a,h
out (3),a
dec a
ld i,a
dec a
out (3),a

ld (hl),a
lddr

ld hl,$0000
ld c,__IntDataEnd-__IntData
ldir

in a,(6)
ld ($8A8A+__IntDataSMC-__IntData+1),a
__IntEnd:
.db rp_Ans,11
       




p_DtoF: 2 bytes smaller. Takes advantage of a bcall to do the same thing. It appears that B_CALL(_SetXXXXOP2) always returns OP2+1, which could be used to save an additional 2 bytes, but this bcall could theoretically be changed in future OS versions and break this optimization.

Code: (Old code: 13 bytes, a lot of cycles) [Select]
p_DtoF:
.db 13
ex (sp),hl
B_CALL(_SetXXXXOP2)
ld hl,OP2
pop de
ld bc,9
ldir
       
   
Code: (New code: 11 bytes, a lot plus a few cycles) [Select]
p_DtoF:
.db 11
ex (sp),hl
B_CALL(_SetXXXXOP2)
ld hl,OP2
pop de
B_CALL(_Mov9B)
       
« Last Edit: September 18, 2011, 03:25:22 am by Runer112 »

Offline calc84maniac

  • eZ80 Guru
  • Coder Of Tomorrow
  • LV11 Super Veteran (Next: 3000)
  • ***********
  • Posts: 2912
  • Rating: +471/-17
    • View Profile
    • TI-Boy CE
Re: Assembly Programmers - Help Axe Optimize!
« Reply #242 on: September 20, 2011, 12:26:00 am »
p_Length: 1 byte smaller, 2 cycles faster. Takes advantage of the fact that you will not need to search more than 16384 bytes starting at $4000-$7FFF or 32768 bytes starting at $8000-$FFFF, and also you shouldn't be searching at $0000-$3FFF.
Code: (Old code: 11 bytes) [Select]
p_Length:
.db __LengthEnd-$-1
xor a
ld b,a
ld c,a
cpir
ld hl,-1
sbc hl,bc
ret
__LengthEnd:
Code: (New code: 10 bytes) [Select]
p_Length:
.db __LengthEnd-$-1
xor a
ld b,h
ld d,h
ld e,l
cpir
scf
sbc hl,de
ret
__LengthEnd:
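
To see why leaving C uninitialized is safe, here is a rough Python model of both versions. The cpir helper and the function names are assumptions made for illustration, and memory is just a 64 KB bytearray.

```python
def cpir(mem, hl, bc, a=0):
    """Z80 cpir: scan forward for A, decrementing BC; stop on a match or when BC hits 0."""
    while True:
        match = mem[hl] == a
        hl = (hl + 1) & 0xFFFF
        bc = (bc - 1) & 0xFFFF
        if match or bc == 0:
            return hl, bc


def length_old(mem, hl):
    _, bc = cpir(mem, hl, 0)                 # xor a / ld b,a / ld c,a / cpir
    return (-1 - bc) & 0xFFFF                # ld hl,-1 / sbc hl,bc (carry is clear)


def length_new(mem, hl, c=0):
    start = hl                               # ld d,h / ld e,l
    bc = ((hl >> 8) << 8) | (c & 0xFF)       # ld b,h ; C keeps whatever it held
    end, _ = cpir(mem, hl, bc)
    return (end - start - 1) & 0xFFFF        # scf / sbc hl,de
```

Since B is copied from H, the search limit is at least $4000 bytes for pointers in $4000-$7FFF and at least $8000 bytes for pointers in $8000-$FFFF, whatever C happens to contain.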
"Most people ask, 'What does a thing do?' Hackers ask, 'What can I make it do?'" - Pablos Holman

Offline jacobly

  • LV5 Advanced (Next: 300)
  • *****
  • Posts: 205
  • Rating: +161/-1
    • View Profile
Re: Assembly Programmers - Help Axe Optimize!
« Reply #243 on: October 09, 2011, 10:16:40 am »
Speed optimization for p_CheckSum by using an absolute jump.
Code: (Old Code: 19 bytes, 63.5*n+37 cycles) [Select]
p_CheckSum:
.db __CheckSumEnd-$-1
ld b,h
ld c,l
pop af
pop hl
push af
xor a
ld d,a
__CheckSumLoop:
add a,(hl)
ld e,a
jr nc,$+3
inc d
cpi
ex de,hl
ret po
ex de,hl
jr __CheckSumLoop
__CheckSumEnd:
Code: (New Code: 19 bytes, 44.5*n+65 cycles) [Select]
p_CheckSum:
.db __CheckSumEnd-$-1
ld b,h
ld c,l
pop af
pop hl
push af
xor a
ld d,a
__CheckSumLoop:
add a,(hl)
jr nc,$+3
inc d
cpi
jp pe,__CheckSumLoop
ld h,d
ld l,a
ret
__CheckSumEnd:
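
Here is a quick Python sketch of what the new loop computes, in case it helps with checking the logic off-calc. The function name and the bytearray memory are illustrative assumptions, and the model expects BC to be at least 1.

```python
def checksum(mem, hl, bc):
    """Mirror the new p_CheckSum loop: 16-bit sum of BC bytes starting at HL."""
    a, d = 0, 0                              # xor a / ld d,a
    while True:
        total = a + mem[hl]                  # add a,(hl)
        a = total & 0xFF
        if total > 0xFF:                     # jr nc,$+3 skips the inc d
            d = (d + 1) & 0xFF
        hl = (hl + 1) & 0xFFFF               # cpi: inc hl, dec bc, P/V = (BC != 0)
        bc = (bc - 1) & 0xFFFF
        if bc == 0:                          # jp pe loops while P/V is set
            break
    return (d << 8) | a                      # ld h,d / ld l,a


# Per-byte cost in T-states, averaging the two jr nc paths (12 taken, 7 + 4 for inc d).
NEW_PER_BYTE = 7 + (12 + 7 + 4) / 2 + 16 + 10
OLD_PER_BYTE = 7 + 4 + (12 + 7 + 4) / 2 + 16 + 4 + 5 + 4 + 12
```

The two per-byte figures line up with the 44.5*n and 63.5*n terms in the code headers.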

Offline Xeda112358

  • they/them
  • Moderator
  • LV12 Extreme Poster (Next: 5000)
  • ************
  • Posts: 4704
  • Rating: +719/-6
  • Calc-u-lator, do doo doo do do do.
    • View Profile
Re: Assembly Programmers - Help Axe Optimize!
« Reply #244 on: October 09, 2011, 05:38:20 pm »
Hmm, would this optimisation work to save one more byte? (sorry, I could be wrong):
Code: [Select]
p_CheckSum:
.db __CheckSumEnd-$-1
ld b,h
ld c,l
pop hl
ex (sp),hl
xor a
ld d,a
__CheckSumLoop:
add a,(hl)
jr nc,$+3
inc d
cpi
jp pe,__CheckSumLoop
ld h,d
ld l,a
ret
__CheckSumEnd:

Offline calc84maniac

  • eZ80 Guru
  • Coder Of Tomorrow
  • LV11 Super Veteran (Next: 3000)
  • ***********
  • Posts: 2912
  • Rating: +471/-17
    • View Profile
    • TI-Boy CE
Re: Assembly Programmers - Help Axe Optimize!
« Reply #245 on: October 09, 2011, 07:21:47 pm »
Ah, nice use of ex (sp),hl :D
"Most people ask, 'What does a thing do?' Hackers ask, 'What can I make it do?'" - Pablos Holman

Offline Xeda112358

  • they/them
  • Moderator
  • LV12 Extreme Poster (Next: 5000)
  • ************
  • Posts: 4704
  • Rating: +719/-6
  • Calc-u-lator, do doo doo do do do.
    • View Profile
Re: Assembly Programmers - Help Axe Optimize!
« Reply #246 on: October 09, 2011, 07:26:47 pm »
Thanks :) I think I learned it from you folks :)
EDIT: It does use 2 more cycles though, right?

Offline calc84maniac

  • eZ80 Guru
  • Coder Of Tomorrow
  • LV11 Super Veteran (Next: 3000)
  • ***********
  • Posts: 2912
  • Rating: +471/-17
    • View Profile
    • TI-Boy CE
Re: Assembly Programmers - Help Axe Optimize!
« Reply #247 on: October 09, 2011, 07:30:34 pm »
Quote from: Xeda112358 on October 09, 2011, 07:26:47 pm
Thanks :) I think I learned it from you folks :)
EDIT: It does use 2 more cycles though, right?
Actually, ex (sp),hl takes 2 fewer cycles than pop af and push af combined, so it's faster too :)
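
A small Python sketch of the two stack shuffles, assuming standard Z80 timings and using a plain list as the stack (the function names are made up for illustration):

```python
def pop_pop_push(stack):
    """pop af / pop hl / push af: 3 bytes, 10 + 10 + 11 cycles."""
    stack = list(stack)
    af = stack.pop()                    # return address
    hl = stack.pop()                    # argument
    stack.append(af)                    # put the return address back
    return hl, stack


def pop_ex(stack):
    """pop hl / ex (sp),hl: 2 bytes, 10 + 19 cycles."""
    stack = list(stack)
    hl = stack.pop()                    # return address
    hl, stack[-1] = stack[-1], hl       # ex (sp),hl swaps HL with the top of the stack
    return hl, stack


OLD_CYCLES = 10 + 10 + 11
NEW_CYCLES = 10 + 19
```

Both versions leave the argument in HL and the return address on top of the stack, so the only differences are the byte count, the cycle count, and the fact that AF is no longer clobbered.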
"Most people ask, 'What does a thing do?' Hackers ask, 'What can I make it do?'" - Pablos Holman

Offline Happybobjr

  • James Oldiges
  • LV11 Super Veteran (Next: 3000)
  • ***********
  • Posts: 2325
  • Rating: +128/-20
  • Howdy :)
    • View Profile
Re: Assembly Programmers - Help Axe Optimize!
« Reply #248 on: October 09, 2011, 07:37:42 pm »
What does checksum do?
School: East Central High School
 
Axe: 1.0.0
TI-84 +SE  ||| OS: 2.53 MP (patched) ||| Version: "M"
TI-Nspire    |||  Lent out, and never returned
____________________________________________________________

Offline calc84maniac

  • eZ80 Guru
  • Coder Of Tomorrow
  • LV11 Super Veteran (Next: 3000)
  • ***********
  • Posts: 2912
  • Rating: +471/-17
    • View Profile
    • TI-Boy CE
Re: Assembly Programmers - Help Axe Optimize!
« Reply #249 on: October 13, 2011, 11:32:57 am »
Here, slightly optimized Bitmap():
Old code, 7 bytes and lots of cycles
Code: [Select]
p_EzSprite:
.db 7
pop de
ld a,e
pop de
ld d,a
B_CALL(_DisplayImage)

New code, 6 bytes and lots of cycles minus 4 :P
Code: [Select]
p_EzSprite:
.db 6
pop bc
pop de
ld d,c
B_CALL(_DisplayImage)
« Last Edit: October 13, 2011, 11:33:23 am by calc84maniac »
"Most people ask, 'What does a thing do?' Hackers ask, 'What can I make it do?'" - Pablos Holman

Offline Xeda112358

  • they/them
  • Moderator
  • LV12 Extreme Poster (Next: 5000)
  • ************
  • Posts: 4704
  • Rating: +719/-6
  • Calc-u-lator, do doo doo do do do.
    • View Profile
Re: Assembly Programmers - Help Axe Optimize!
« Reply #250 on: October 14, 2011, 02:54:36 pm »
Is this an optimisation? I get the feeling that there is a reason it doesn't end in a ret and that it uses a jr...

Code: (Old Code: 7 bytes, 30 or 38 cycles) [Select]
p_DecWord:
.db 7
ld a,(hl)
dec (hl)
or a
jr nz,$+4
inc hl
dec (hl)
Code: (New Code: 6 bytes, 29 or 36 cycles) [Select]
p_DecWord:
.db 6
ld a,(hl)
dec (hl)
or a
ret nz
inc hl
dec (hl)

EDIT: Yep, suspicion confirmed XD
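
For reference, this is what the original routine does, as a minimal Python sketch (the function name and bytearray memory are illustrative). The reason for the jr is an assumption on my part: since the routine has a size byte and no final ret, it is presumably pasted inline, so a ret nz would return from the surrounding code rather than just skip the high byte.

```python
def dec_word(mem, hl):
    """Decrement the little-endian 16-bit word at address hl, as p_DecWord does."""
    low = mem[hl]                            # ld a,(hl)
    mem[hl] = (low - 1) & 0xFF               # dec (hl)
    if low == 0:                             # or a / jr nz,$+4 skips the borrow
        mem[hl + 1] = (mem[hl + 1] - 1) & 0xFF   # inc hl / dec (hl)
    return mem[hl] | (mem[hl + 1] << 8)
```

The high byte is only touched when the low byte was zero before the decrement, which is the borrow case.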

Offline Quigibo

  • The Executioner
  • CoT Emeritus
  • LV11 Super Veteran (Next: 3000)
  • *
  • Posts: 2031
  • Rating: +1075/-24
  • I wish real life had a "Save" and "Load" button...
    • View Profile
Re: Assembly Programmers - Help Axe Optimize!
« Reply #251 on: November 04, 2011, 01:58:14 am »
Not an optimization, but I'm posting this here since more assembly people will read it.  Since the Bitmap() command is being replaced with something actually useful, that means the "Fix 8" and "Fix 9" will also need to be replaced.  Are there any flags (particularly for text) that would be useful to Axe programmers and that I haven't already covered with the other fix commands?  A couple I can think of are an APD toggle and a lowercase toggle.
___Axe_Parser___
Today the calculator, tomorrow the world!

Offline LincolnB

  • Check It Out Now
  • LV9 Veteran (Next: 1337)
  • *********
  • Posts: 1115
  • Rating: +125/-4
  • By Hackers For Hackers
    • View Profile
Re: Assembly Programmers - Help Axe Optimize!
« Reply #252 on: November 04, 2011, 10:24:39 am »
Hm...I say this as an Axe programmer, not knowing ASM...how about UPSIDE DOWN TEXT! om nom nom nom
Completed Projects:
   >> Spacky Emprise   >> Spacky 2 - Beta   >> Fantastic Sam
   >> An Exercise In Futility   >> GeoCore

My Current Projects:

Projects in Development:
In Medias Res - Contest Entry

Talk to me if you need help with Axe coding.


Spoiler For Bragging Rights:
Not much yet, hopefully this section will grow soon with time (and more contests)



Offline jacobly

  • LV5 Advanced (Next: 300)
  • *****
  • Posts: 205
  • Rating: +161/-1
    • View Profile
Re: Assembly Programmers - Help Axe Optimize!
« Reply #253 on: November 15, 2011, 12:01:37 am »
p_Input: saves three bytes and lots of cycles
Code: (Old code) [Select]
p_Input:
.db __InputEnd-$-1
res 6,(iy+$1C)
set 7,(iy+$09)
xor a
ld (ioPrompt),a
B_CALL(_GetStringInput)
B_CALL(_ZeroOP1)
ld hl,$2D04
ld (OP1),hl
B_CALL(_ChkFindSym)
inc de
inc de
ex de,hl
ret
__InputEnd:
Code: (New code) [Select]
p_Input:
.db __InputEnd-$-1
res 6,(iy+$1C)
set 7,(iy+$09)
xor a
ld (ioPrompt),a
B_CALL(_GetStringInput)
B_CALL(_ZeroOP1)
ld a,$2D
ld (OP1+1),a
rst rFindSym
inc de
inc de
ex de,hl
ret
__InputEnd:

Offline Quigibo

  • The Executioner
  • CoT Emeritus
  • LV11 Super Veteran (Next: 3000)
  • *
  • Posts: 2031
  • Rating: +1075/-24
  • I wish real life had a "Save" and "Load" button...
    • View Profile
Re: Assembly Programmers - Help Axe Optimize!
« Reply #254 on: November 16, 2011, 05:52:32 pm »
Thanks! :D
___Axe_Parser___
Today the calculator, tomorrow the world!

 
