|
Post by huckle on Oct 21, 2006 17:23:39 GMT -5
I'm trying to recreate one of my Amiga intros on the DTV (sinescroller). I'm running out of screen time, however. I'm looking for some general optimization tips on the 64. This loop is what's killing my program (this is before I've even added the code to fetch and add the sine table info to the loop):
        lda #160-01   ; screen width loop
        sta screenlp
loop7   ldx #8-1      ; mask loop
loop6   ldy #8-1      ; char loop
loop5   lda (mem1),y  ; Mask and splice data onto display
        and mask,x
        ora (mem2),y
        sta (mem2),y
        dey
        bpl loop5
        dex
        bpl loop6
        inc mem1
        inc mem2
        dec screenlp
        bpl loop7
        rts
|
|
|
Post by Robin Harbron on Oct 21, 2006 20:16:11 GMT -5
Self-modifying code would save some cycles. Try something like this:
        lda #160-01   ; screen width loop
        sta screenlp
loop7   ldx #8-1      ; mask loop
loop6   lda mask,x    ; fetch the mask once per x pass
        sta sm1+1     ; and patch it into the immediate AND below
        ldy #8-1      ; char loop
loop5   lda $ffff,y   ; Mask and splice data onto display
sm1     and #$ff
sm2     ora $ffff,y
sm3     sta $ffff,y
        dey
        bpl loop5
        dex
        bpl loop6
        inc loop5+1
        inc sm2+1
        inc sm3+1
        dec screenlp
        bpl loop7
        rts
That should save around 8 cycles on the inside loop, while adding a few cycles outside (still definitely worth it). I wasn't able to test it, but doing the mask the way I've changed it should work: the 4-cycle lookup in and mask,x (same result 8 times in a row) can be done just once per x loop, and the rest of the time a 2-cycle immediate AND can be done. Let me know if you need anything explained.
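To put rough numbers on the saving, here is a small Python tally of the two inner loops. It is only a sketch: it assumes standard NMOS 6502 instruction timings, no page-boundary crossings, and a taken branch that stays within the page. By this count the self-modifying version comes out about 5 cycles shorter per pass, a little under the estimate above, and the DTV's burst and skip-cycle modes will shift the figures again.

```python
# Cycle costs per instruction: standard NMOS 6502 timings, assuming no
# page-boundary crossings and a taken branch within the same page.
ORIGINAL_INNER = {
    "lda (mem1),y": 5,
    "and mask,x": 4,
    "ora (mem2),y": 5,
    "sta (mem2),y": 6,
    "dey": 2,
    "bpl loop5": 3,
}

SELF_MODIFYING_INNER = {
    "lda $ffff,y": 4,
    "and #$ff": 2,
    "ora $ffff,y": 4,
    "sta $ffff,y": 5,
    "dey": 2,
    "bpl loop5": 3,
}


def cycles(loop):
    """Sum the cycle cost of one pass through a loop body."""
    return sum(loop.values())


original = cycles(ORIGINAL_INNER)
modified = cycles(SELF_MODIFYING_INNER)
saving = original - modified
print(original, modified, saving)
```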
|
|
|
Post by huckle on Oct 23, 2006 19:53:51 GMT -5
Thanks, I will certainly have a go with this. Hopefully it will work OK with the DTV's burst mode. I may re-investigate using the blitter to do this, but I need 3 source channels and it only has 2.
|
|
|
Post by gmoon on Oct 25, 2006 6:39:27 GMT -5
Maybe blitting twice would be faster than 65xx coding?
What are the three sources? (Is the transparency bug part of the problem?)
|
|
|
Post by huckle on Oct 25, 2006 7:59:08 GMT -5
I need a function whereby, if a bit is set in the mask, the source data is copied through, and if a bit is not set in the mask, the original data is kept. So I want a source channel pointing to the destination.
For example:
Channel A = Source
Channel B = Mask
Channel C = Screen destination
Channel D = Screen destination
D = (A AND B) OR ((NOT B) AND C)
I basically want to be able to write to the same cell while preserving the original data.
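The merge described here (source bits where the mask is set, existing screen bits everywhere else) can be modelled per byte in a few lines of Python. This is only an illustration of the logic, not blitter or 6502 code, and the function names are made up for the example. The second function shows what a plain AND-then-ORA sequence like the CPU loop at the top of the thread computes, which never clears the destination bits under the mask.

```python
def merge(source, mask, dest):
    """Source bits where mask is 1, dest bits where mask is 0 (8-bit)."""
    return ((source & mask) | (dest & ~mask)) & 0xFF


def or_only(source, mask, dest):
    """AND the source with the mask, then OR onto dest; dest is never cleared."""
    return ((source & mask) | dest) & 0xFF


source = 0b10100000
mask = 0b11110000
dest = 0b01011111

merged = merge(source, mask, dest)
ored = or_only(source, mask, dest)
print(f"{merged:08b} {ored:08b}")
```

The two results differ whenever the destination already has bits set inside the masked area, which is the case the third source channel is needed for.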
|
|
|
Post by gmoon on Oct 27, 2006 8:41:04 GMT -5
So I take it this is a VIC-style (one or two bits = one pixel) screen? Guess you couldn't use transparency mode for that...
Have you tried using a separate copy of the background as the A source, and ORing the B channel to the destination? Would depend on how static the background is.
|
|
|
Post by huckle on Mar 22, 2007 19:28:37 GMT -5
Has there ever been a true 'sine scroller' coded for the 64? Such as you would see on the Amiga. I'm really struggling with this. If you have any code please share..
|
|
|
Post by Robin Harbron on Mar 22, 2007 19:46:56 GMT -5
Could you share a screen-shot so I know for sure what you mean? It might be what I coded about 10 years back in a demo called "Spin". You can grab it here: noname.c64.org/csdb/release/?id=4534
It's pretty slow and clunky, and surely not the best example, but if it's what you're looking for, we can talk about it (make sure you move the joystick left/right to see the 3 different scroll patterns, and up/down to change the speed).
|
|
|
Post by tlr on Mar 23, 2007 3:42:12 GMT -5
@robin: he means like this: Dypp Ecstasy.
On the DTV you can probably make it without tricks if it's small enough and you enable skip cycle + burst. You still need to employ heavy optimization though. The obvious one is loop unrolling. In your example, rewrite as (for multicolor):
        ldy sine
        lda mem1
        and #%11000000
        ora mem2,y
        sta mem2,y
        lda mem1+1
        and #%11000000
        ora mem2+1,y
        sta mem2+1,y
        lda mem1+2
        and #%11000000
        ora mem2+2,y
        sta mem2+2,y
        ....
        lda mem1+7
        and #%11000000
        ora mem2+7,y
        sta mem2+7,y

        ldy sine+1
        lda mem1
        and #%00110000
        ora mem2,y
        sta mem2,y
        ....
        lda mem1+7
        and #%00110000
        ora mem2+7,y
        sta mem2+7,y
        ... and so on...
        rts
You can even precalculate 4 different scrolls with the AND already applied, eliminating the need for it in the above. These scrolls can be scrolled by the blitter, and then have the sine applied using this routine.
Note that it will be faster to do it in 256-color (8 bpp) mode, because then you won't need to mask at all. Each byte is a pixel:
        ldy sine
        lda mem1
        sta mem2,y
        lda mem1+1
        sta mem2+1,y
        lda mem1+2
        sta mem2+2,y
        ....
        lda mem1+7
        sta mem2+7,y

        ldy sine+1
        lda mem1
        sta mem2,y
        ....
        lda mem1+7
        sta mem2+7,y
        ... and so on...
        rts
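Unrolled code like this is tedious to type, so it is often generated by a script. The Python sketch below prints the masked multicolor sequence for one 8-byte character column. The labels mem1, mem2 and sine come from the post; the last two masks (%00001100 and %00000011) and the sine+2 / sine+3 indices are an assumed continuation of the "and so on", and the function name is hypothetical. How mem1 and mem2 advance to the next character column is not shown in the thread, so the sketch stops after one column.

```python
# One mask per multicolor pixel pair; the first two appear in the post,
# the last two are an assumed continuation of the same pattern.
PIXEL_MASKS = ["%11000000", "%00110000", "%00001100", "%00000011"]


def unrolled_column(rows=8, masks=PIXEL_MASKS, src="mem1", dst="mem2", table="sine"):
    """Return assembly lines for one character column, fully unrolled."""
    lines = []
    for i, mask in enumerate(masks):
        index = table if i == 0 else f"{table}+{i}"
        lines.append(f"    ldy {index}")  # sine offset for this pixel pair
        for row in range(rows):
            s = src if row == 0 else f"{src}+{row}"
            d = dst if row == 0 else f"{dst}+{row}"
            lines += [
                f"    lda {s}",
                f"    and #{mask}",
                f"    ora {d},y",
                f"    sta {d},y",
            ]
    lines.append("    rts")
    return lines


code = unrolled_column()
print("\n".join(code))
```

For the 8 bpp variant, the same loop with the and/ora lines dropped and one ldy per byte column would produce the shorter lda/sta sequence.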
|
|
|
Post by huckle on Mar 24, 2007 14:36:03 GMT -5
Thanks. That is exactly what I meant by a 'sine scroller'. I'll have a look at your suggestions, although I'm currently using 320*200 2-colour mode (despite my example code at the start of the thread having a screen width of 160).
|
|